(54) Digital receiving apparatus and method

(57) Upon reproducing a TV program, it is a com- 
mon practice to display video data sent from a broadcast 
station as it is, and the display pattern (layout) is not 
effectively changed (e.g., an object in video data is 
erased, or the object size is changed). A program ID 
from additional data contained in received TV information
is detected, and when layout setting data corresponding
to the detected program ID is stored in a memory,
the corresponding layout setting data is read out
from the memory to display program video data in the 
set layout. When a new layout is set, the user selects 
an object for which a layout is to be adjusted from ob- 
jects that form image data in TV information, and adjusts 
movement, upscaling/downscaling, display ON/OFF of 
the selected object. 
Description

[0001] The present invention relates to a receiving ap- 
paratus and method and, more particularly, to a receiv- 
ing apparatus which can receive a digital television 
broadcast signal and can reproduce image and sound 
data, and its method. 

[0002] In recent years, digital television broadcast us- 
ing a satellite broadcast or cable broadcast system has 
been started. Upon implementation of digital broadcast, 
many effects such as improvement of qualities of image 
and sound data including audio data, increases in the 
number of kinds and volume of programs exploiting var- 
ious compression techniques, provision of new services 
such as an interactive service and the like, advance of 
the receiving pattern, and the like, are expected. 
[0003] Fig. 1 is a block diagram showing the arrange- 
ment of a digital broadcast receiver 10 using satellite 
broadcast. 

[0004] A television (TV) broadcast wave transmitted
from a broadcast satellite is received by an antenna 1.
The received TV broadcast wave is tuned by a tuner 2
to demodulate TV information. After that, an error
correction process and, if necessary, a charging process,
a descramble process, and the like are performed
(not shown). Various data multiplexed as the TV information
are demultiplexed by a multiplexed signal demultiplexer 3.
The TV information is demultiplexed into image
information, sound information, and other additional
data. The demultiplexed data are decoded by a decoder
4. Of the decoded data, image information and sound 
information are converted into analog data by a D/A con- 
verter 5, and these data are reproduced by a television 
receiver (TV) 6. On the other hand, the additional data 
has a role of program sub-data, and is associated with 
various functions. 

[0005] Furthermore, a VTR 7 is used to record/repro- 
duce the received TV information. The receiver 10 and 
VTR 7 are connected via a digital interface such as 
IEEE 1394 or the like. The VTR 7 has a recording format
such as a digital recording system, and records TV in- 
formation as bitstream data based on, e.g., D-VHS. 
Note that TV information of digital TV broadcast can be 
recorded not only by bitstream recording based on D- 
VHS, but also by the digital Video (DV) format as anoth- 
er home-use digital recording scheme, or digital record- 
ing apparatuses using various disk media. In such case, 
format conversion may often be required. 
[0006] When a TV program in ground wave broadcast
or digital TV broadcast is reproduced by a home televi- 
sion, it is a common practice to directly display a video 
sent from a broadcast station. In other words, it is not a 
common practice to erase an object in a video or to 
change the object size so as to effectively change the 
display pattern (layout). Such a function of effectively 
changing the display layout is mandatory since a new 
function of an effective display method must be added 
as the numbers of channels and programs increase upon
development of digital TV broadcast.
[0007] For example, the user wants to set a layout in 
the following situation. That is, live programs of baseball 
games have different display layouts depending on
broadcast stations although they belong to an identical
category. For this reason, in order to display an object 
such as a score indication or the like in a common layout 
independently of broadcast stations, it is desirable to be 
able to set a layout the user wants. 

[0008] Furthermore, the user also wants to set a layout
in the following situation. For example, the user may
want to display necessary information in an enlarged
scale or to stop displaying unnecessary information in
accordance with the day of the week or time band. However,
neither of these layout setups is currently possible.

[0009] One aspect of the present invention provides 
a novel reproducing function of image information and/ 
or sound information in digital TV broadcast. 
[0010] A preferred embodiment of the present invention
comprises a receiving apparatus capable of reproducing
image data and/or sound data, comprising:
reception means for receiving information consisting of
image data, sound data, and additional system data;
reproducing means for reproducing received image and
sound data on the basis of the system data; and setting
means for setting reproduction patterns in units of objects
when the received image data has a data format
segmented in units of objects.
[0011] Also, a preferred embodiment of the present
invention provides a computer program product comprising
a computer readable medium having a computer
program code, for a method of receiving information,
and reproducing image data and/or sound data, the
product comprising: a receiving process procedure
code for receiving information consisting of image data,
sound data, and additional system data; a reproducing
process procedure code for reproducing received image
and sound data on the basis of the system data; and a
setting process procedure code for setting reproduction
patterns in units of objects when the received image
data has a data format segmented in units of objects.
[0012] Embodiments of the present invention will now
be described with reference to the accompanying drawings,
in which:

Fig. 1 is a block diagram showing the arrangement
of a digital broadcast receiver using satellite broadcast;
Fig. 2 is a block diagram showing the arrangement
that simultaneously receives and encodes a plurality
of kinds of objects;
Fig. 3 is a view showing the arrangement of a system
that takes user operation (edit) into consideration;
Fig. 4 is a block diagram of a VOP processor that
pertains to a video object on the encoder side;
Fig. 5 is a block diagram of a VOP processor that
pertains to a video object on the decoder side;
Fig. 6 is a block diagram showing the overall
arrangement for encoding and decoding a VOP;
Figs. 7A and 7B show information forming a VOP;
Fig. 8 is a view for explaining AC/DC predictive
coding in texture coding;
Figs. 9A and 9B are views for explaining the
hierarchical structure of a syntax that implements
scalability;
Fig. 10A is a view for explaining warp;
Fig. 10B is a table for explaining different types of
warp;
Fig. 11 is a view for explaining warp;
Fig. 12 is a view showing an example of the format
of scene description information;
Fig. 13 is a table showing different types of MPEG
4 audio coding schemes;
Fig. 14 is a diagram showing the arrangement of an
audio coding scheme;
Fig. 15 is a view for explaining the MPEG 4 system
structure;
Fig. 16 is a view for explaining the MPEG 4 layer
structure;
Fig. 17 is a view for explaining reversible decoding;
Fig. 18 is a view for explaining multiple transmissions
of important information;
Fig. 19 is a block diagram showing the arrangement
of a TV broadcast receiving apparatus according to
the first embodiment of the present invention;
Fig. 20 is a diagram for explaining a method of setting
position data upon setting a layout;
Fig. 21 is a view for explaining a method of inputting
an image and instruction upon setting a layout;
Fig. 22 is a view for explaining the format of layout
setting data;
Fig. 23 is a view showing an example of a video
display layout according to the first embodiment;
Fig. 24 shows the format of a general MPEG 4 bitstream;
Fig. 25 is a flow chart for explaining the operation
sequence of the TV broadcast receiving apparatus
of the first embodiment;
Fig. 26 is a block diagram showing the arrangement
of an encoding unit mounted in an MPEG 4 TV
broadcasting system;
Fig. 27 is a block diagram showing the arrangement
of a decoding unit mounted in the TV broadcast
receiving apparatus;
Fig. 28 is a view showing an example of an MPEG
4 bitstream containing an MPEG 2 image;
Fig. 29 is a view showing the format of time data
and its setting data upon setting display of a time
indication image in more detail;
Fig. 30 is a block diagram showing the arrangement
of a TV broadcast receiving apparatus according to
the third embodiment of the present invention;
Figs. 31 to 34 show video display layout examples
according to the third embodiment;
Figs. 35 and 36 are flow charts for explaining the
operation sequence of the TV broadcast receiving
apparatus according to the third embodiment;
Fig. 37 is a block diagram showing the arrangement
of a TV broadcast receiving apparatus according to
the fifth embodiment of the present invention;
Fig. 38 is a diagram for explaining output control of
a sound object in accordance with layout setting data;
Fig. 39 is a view for supplementarily explaining a
sound image and sound field lateralization;
Fig. 40 shows the format of a general MPEG 4 bitstream;
Figs. 41 and 42 show video display layout examples
according to the fifth embodiment;
Fig. 43 shows the concept of the code format of
object information;
Fig. 44 shows the concept of the structure of layout
setting data;
Figs. 45 and 46 are flow charts for explaining the
operation sequence of the TV broadcast receiving
apparatus of the fifth embodiment; and
Fig. 47 is a view for explaining a method of multiplexing
an MPEG 4 datastream on an MPEG 2 datastream.

Outline 

[0013] This embodiment allows movement and deformation
of an image in units of objects by exploiting the
concept of objects as characteristic features of Motion 
Picture Experts Group layer 4 (MPEG 4) coding. Objects 
include a background image, talking person, voice as- 
sociated with this person, and the like, and MPEG 4 cod- 
ing encodes/decodes individual objects and combines 
these objects to express one scene. 
[0014] A display function of this embodiment can ma- 
nipulate images to be displayed in units of objects in as- 
sociation with display of real-time image information in 
a broadcast system using MPEG 4. Furthermore, the 
display function of this embodiment can upscale/downscale
the individual objects from a predetermined size,
and can move them from a predetermined position. TV
broadcast carries each program as TV information
together with unique ID information specified for that
program, and an arbitrary reproduction (display) layout
can be set and updated for each piece of ID
information.
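
The relation between program ID information and a stored layout can be pictured with the following minimal sketch. It is not the format of the layout setting data of this embodiment; the names `ObjectLayout`, `LayoutStore`, and `place`, and the choice of offset, scale, and visibility fields, are assumptions made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectLayout:
    """Viewer adjustment for one object (hypothetical fields)."""
    dx: int = 0            # horizontal shift from the broadcast position, in pixels
    dy: int = 0            # vertical shift from the broadcast position, in pixels
    scale: float = 1.0     # upscaling/downscaling factor
    visible: bool = True   # display ON/OFF


@dataclass
class LayoutStore:
    """Memory that keeps layouts keyed by program ID, then by object ID."""
    layouts: dict = field(default_factory=dict)

    def set_layout(self, program_id, object_id, layout):
        self.layouts.setdefault(program_id, {})[object_id] = layout

    def lookup(self, program_id, object_id):
        # No stored layout means the object is shown as broadcast.
        return self.layouts.get(program_id, {}).get(object_id, ObjectLayout())


def place(store, program_id, object_id, x, y, w, h):
    """Return the adjusted (x, y, w, h), or None when display is OFF."""
    layout = store.lookup(program_id, object_id)
    if not layout.visible:
        return None
    return (x + layout.dx, y + layout.dy,
            round(w * layout.scale), round(h * layout.scale))
```

A program whose ID has no stored entry falls through to the default layout, which corresponds to displaying the video as sent from the broadcast station.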

[0015] According to this embodiment, the viewer of 
digital TV broadcast can set an arbitrary layout, i.e., can 
set the individual objects at desired positions to have 
desired sizes, thus improving the visual effect for the us- 
er and the quality of the user interface. 
[0016] The arrangement of a receiving apparatus that 
receives digital TV broadcast using MPEG 4 coding will 
be exemplified below as a receiving apparatus according
to an embodiment of the present invention. Techniques
that pertain to MPEG 4 will be explained in detail
below in units of fields.

Outline of MPEG 4 

[Overall Configuration of Standards] 

[0017] The MPEG 4 standards consist of four major 
items. Three out of these items are similar to those of 
Motion Picture Experts Group layer 2 (MPEG 2), i.e., 
visual part, audio part, and system part. 

•Visual Part 

[0018] This part specifies object coding that processes
a photo image, synthetic image, moving image, still
image, and the like as standards. Also, this part includes
a coding scheme, sync reproducing function, and
hierarchical coding, which are suitable for correction or
recovery of transmission path errors. Note that "video"
means a photo image, and "visual" includes a synthetic
image.

•Audio Part 

[0019] This part specifies object coding for natural 
sound, synthetic sound, effect sound, and the like as 
standards. The video and audio parts specify a plurality 
of coding schemes, and coding efficiency is improved 
by appropriately selecting a compression scheme suit- 
able for the feature of each object. 

•System Part 



[0020] This part specifies multiplexing of encoded video
and sound objects, and their demultiplexing.
[0021] This part also covers adjustment functions of
buffer memories and time bases. Video and sound objects
encoded in the visual and audio parts are combined into
a multiplexed stream of the system part together with
scene configuration information that describes the
positions, appearance and disappearance times of objects
in a scene. As a decoding process, the individual objects
are demultiplexed/decoded from a received bitstream, and
a scene is reconstructed on the basis of the scene
configuration information.

[Object coding] 

[0022] In MPEG 2, coding is done in units of frames
or fields. However, in order to re-use or edit contents,
MPEG 4 processes video and audio data as objects.
The objects include:

sound
photo image (background image: two-dimensional
still image)
photo image (principal object image: without
background)
synthetic image
character image

[0023] Fig. 2 shows the system arrangement upon si- 
multaneously receiving and encoding these objects. A 
sound object encoder 5001, photo image object encoder
5002, synthetic image object encoder 5003, and character
object encoder 5004 respectively encode objects.
Simultaneously with such encoding, scene configuration
information that describes relations of the individual
objects in a scene is encoded by a scene description
information encoder 5005. The encoded object information
and scene description information are multiplexed into
an MPEG 4 bitstream by a data multiplexer 5006.

[0024] In this manner, the encode side defines a plu- 
rality of combinations of visual and audio objects to ex- 
press a single scene (frame). As for visual objects, a 
scene that combines a photo image and a synthetic im- 
age such as computer graphics or the like can be syn- 
thesized. With the aforementioned configuration, using, 
e.g., a text-to-speech synthesis function, an object im- 
age and its audio data can be synchronously repro- 
duced. Note that the bitstream is transmitted/received 
or recorded/reproduced. 

[0025] A decode process is the reverse of the
aforementioned encode process. A data demultiplexer
5007 demultiplexes the MPEG 4 bitstream into objects,
and distributes the objects. The demultiplexed sound,
photo image, synthetic image, character objects, and
the like are decoded into object data by corresponding
decoders 5008 to 5011. Also, the scene description
information is simultaneously decoded by a decoder
5012. A scene synthesizer 5013 synthesizes an original
scene from the decoded information.

[0026] On the decode side, the positions of visual ob- 
jects contained in a scene, the order of audio objects, 
and the like can be partially changed. The object posi- 
tion can be changed by, e.g., dragging a mouse, and the 
language can be changed when the user changes an 
audio object. 
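
The decode-side freedom described above can be sketched as follows. This is only an illustration under stated assumptions: the node dictionaries, the `kind` and `language` keys, and the function name `reconstruct_scene` are hypothetical and do not reflect the actual scene description syntax.

```python
def reconstruct_scene(nodes, moved=None, language=None):
    """Rebuild a scene from decoded nodes, applying viewer changes.

    nodes    : list of dicts such as
               {"id": "person", "kind": "visual", "x": 40, "y": 20} or
               {"id": "audio-en", "kind": "audio", "language": "en"}
    moved    : optional {object_id: (x, y)} set by, e.g., dragging a mouse
    language : optional language code selecting another audio object
    """
    moved = moved or {}
    visuals = []
    audio = None
    for node in nodes:
        if node["kind"] == "visual":
            # A position chosen by the viewer replaces the transmitted one.
            x, y = moved.get(node["id"], (node["x"], node["y"]))
            visuals.append((node["id"], x, y))
        elif node["kind"] == "audio":
            # The first audio object is the default; a language match wins.
            wanted = language is not None and node.get("language") == language
            if audio is None or wanted:
                audio = node["id"]
    return visuals, audio
```

Visual objects keep the order given by the scene description, so only the position of the dragged object and the selected audio object differ from the scene the transmitting side intended.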

[0027] In order to synthesize a scene by freely com- 
bining a plurality of objects, the following four items are 
specified: 

• Object Coding 

Visual objects, audio objects, and AV (audiovis- 
ual) objects as their combination are to be encoded. 

• Scene Synthesis 

In order to specify scene configuration informa- 
tion and a synthesis scheme that synthesize a de- 
sired scene by combining visual, audio and AV ob- 
jects, a language obtained by modifying Virtual Re- 
ality Modeling Language (VRML) is used. 

• Multiplexing and Synchronization 

The format and the like of a stream (elementary
stream) that multiplexes and synthesizes the individual
objects and the like are specified. The QOS
(Quality of Service) upon delivering this stream onto
a network or storing it in a recording apparatus can
also be set. QOS parameters include transmission
path conditions such as a maximum bit rate, bit error
rate, transmission scheme, and the like, decoding
capability, and the like.

• User Operation (Interaction)

A scheme for synthesizing visual and audio objects
on the user terminal side is defined. The
MPEG 4 user terminal demultiplexes data sent from
a network or a recording apparatus into elementary
streams, and decodes them in units of objects. Also,
the terminal reconstructs a scene from a plurality of
encoded data on the basis of scene configuration
information sent at the same time.

Fig. 3 shows the arrangement of a system that
takes user operation (edit) into consideration. Fig.
4 is a block diagram of a VOP processor that pertains
to a video object on the encoder side, and Fig.
5 is a block diagram on the decoder side.

Upon encoding a video in MPEG 4, a video object
to be encoded is separated into its shape and
texture. This unit video data is called a video object
plane (VOP). Fig. 6 is a block diagram showing the
overall arrangement for encoding and decoding a
VOP.

For example, when an image is composed of
two objects, i.e., a person and background, each
frame is segmented into two VOPs which are
encoded. Each VOP is formed by shape information,
motion information, and texture information of an
object, as shown in Fig. 7A. On the other hand, a
decoder demultiplexes a bitstream into VOPs,
decodes the individual VOPs, and synthesizes them
to form a scene.

In this manner, since the VOP structure is
adopted, when a scene to be processed is composed
of a plurality of video objects, they can be
segmented into a plurality of VOPs, and those
VOPs can be individually encoded/decoded. When
the number of VOPs is 1, and an object shape is a
rectangle, conventional frame unit coding is done,
as shown in Fig. 7B.
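
The VOP structure described above can be sketched with a small data type. The field layout and the names `VOP`, `is_frame_unit`, and `compose` are assumptions for illustration, with a shape of `None` standing in for a rectangular object; actual VOPs carry coded shape, motion, and texture data rather than raw arrays.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VOP:
    """One video object plane: shape, motion, and texture of an object."""
    object_id: str
    shape: Optional[List[List[int]]]   # binary mask; None means a full rectangle
    motion: Tuple[int, int]            # motion information (placeholder vector)
    texture: List[List[int]]           # pixel values, same size as the frame


def is_frame_unit(vops):
    """True when one rectangular VOP reduces to frame unit coding."""
    return len(vops) == 1 and vops[0].shape is None


def compose(vops, width, height, background=0):
    """Synthesize a scene by drawing the VOPs in order, later ones on top."""
    frame = [[background] * width for _ in range(height)]
    for vop in vops:
        for y in range(height):
            for x in range(width):
                if vop.shape is None or vop.shape[y][x]:
                    frame[y][x] = vop.texture[y][x]
    return frame
```

With a background VOP and a person VOP, `compose` overwrites the background only where the person's shape mask is set, which is the role the shape information plays on the decoder side.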

VOPs include those coded by three different
types of predictive coding, i.e., an intra coded VOP
(I-VOP), a forward predicted VOP (P-VOP), and a
bi-directionally predicted VOP (B-VOP). The prediction
unit is a 16 x 16 pixel macroblock (MB).

Bi-directional predictive coding (B-VOP) is a
scheme for predicting a VOP from both past and
future VOPs like in B-picture of MPEG 1 and MPEG
2. Four different modes, i.e., direct coding, forward
coding, backward coding, and bi-directional coding
can be selected in units of macroblocks. This mode
can be switched in units of MBs or blocks.
Bi-directional prediction is implemented by scaling the
motion vectors of P-VOPs.



[Shape Coding] 

[0028] In order to handle an image in units of objects, 
the shape of the object must be known upon encoding 
and decoding. In order to express an object such as 
glass through which an object located behind it is seen, 
information that represents transparency of an object is 
required. A combination of the shape information and 
transparency information of the object will be referred to 
as shape information hereinafter. Coding of the shape 
information will be referred to as shape coding herein- 
after. 

[Size Conversion Process] 

[0029] Binary shape coding is a scheme for coding a 
boundary pixel by checking if each pixel is located out- 
side or inside an object. Hence, as the number of pixels 
to be encoded is smaller, the generated code amount 
can be smaller. However, reducing the macroblock size 
to be encoded means deteriorated original shape code 
at the receiving side. Hence, the degree of deterioration 
of original information is measured by size conversion, 
and as long as the size conversion error stays equal to 
or smaller than a predetermined threshold value, the 
smallest possible macroblock size is selected. As ex- 
amples of the size conversion ratio, an original size, 1/2 
(vertical and horizontal), and 1/4 (vertical and horizon- 
tal) are available. 
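
The selection rule in the preceding paragraph can be sketched as below. The majority-vote downsampling, the pixel-count error measure, and the function names are assumptions chosen for illustration; the sketch also assumes the mask dimensions are divisible by the reduction factor.

```python
def downsample(mask, factor):
    """Reduce a binary mask by majority vote over factor x factor cells."""
    out = []
    for y in range(0, len(mask), factor):
        row = []
        for x in range(0, len(mask[0]), factor):
            cell = [mask[yy][xx]
                    for yy in range(y, y + factor)
                    for xx in range(x, x + factor)]
            row.append(1 if 2 * sum(cell) >= len(cell) else 0)  # ties count as inside
        out.append(row)
    return out


def upsample(mask, factor):
    """Restore the original size by pixel repetition."""
    return [[v for v in row for _ in range(factor)]
            for row in mask for _ in range(factor)]


def conversion_error(mask, factor):
    """Count pixels that differ after reducing and restoring the mask."""
    restored = upsample(downsample(mask, factor), factor)
    return sum(a != b
               for row_a, row_b in zip(mask, restored)
               for a, b in zip(row_a, row_b))


def choose_factor(mask, threshold, factors=(4, 2, 1)):
    """Pick the strongest reduction (1/4, 1/2, original) within the threshold."""
    for factor in factors:
        if conversion_error(mask, factor) <= threshold:
            return factor
    return 1
```

For a 4 x 4 mask whose left half is inside the object, a 1/2 reduction restores the shape exactly, so it is chosen when no error is tolerated, while a looser threshold allows the 1/4 reduction.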

[0030] Shape information of each VOP is described
by an 8-bit α value, which is defined as follows.

α = 0: outside the VOP of interest
α = 1 to 254: display in semi-transparent state
together with another VOP
α = 255: display range of only the VOP of interest

[0031] Binary shape coding is done when the α value
assumes 0 or 255, and a shape is expressed by only
the interior and exterior of the VOP of interest.
Multi-valued shape coding is done when the α value can
assume all values from 0 to 255, and a state wherein a
plurality of VOPs are superposed on each other in a
semi-transparent state can be expressed.
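
The three ranges of the 8-bit shape value, and the superposition of two VOPs in a semi-transparent state, can be sketched as follows. The function names and the rounded integer blend are assumptions for illustration and are not taken from the standard.

```python
def classify_alpha(a):
    """Map an 8-bit shape value to the three cases defined above."""
    if not 0 <= a <= 255:
        raise ValueError("shape value must fit in 8 bits")
    if a == 0:
        return "outside"
    if a == 255:
        return "opaque"
    return "semi-transparent"


def is_binary_shape(alpha_plane):
    """True when every value is 0 or 255, i.e. binary shape coding applies."""
    return all(a in (0, 255) for row in alpha_plane for a in row)


def blend(fg, bg, a):
    """Superpose a foreground pixel on a background pixel with weight a/255."""
    return (a * fg + (255 - a) * bg + 127) // 255
```

With `a` equal to 0 the background pixel is returned unchanged and with 255 the foreground pixel is returned, matching the "outside" and "only the VOP of interest" cases.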
[0032] As in texture coding, motion-compensated 
prediction with unit pixel precision is done in units of 16 
x 16 pixel blocks. Upon intra coding the entire object, 
shape information is not predicted. As a motion vector, 
the difference of a motion vector predicted from a neigh- 
boring block is used. The obtained difference value of 
the motion vector is encoded and multiplexed on a bitstream.
In MPEG 4, motion-compensated predicted
shape information in units of blocks undergoes binary 
shape coding. 

•Feathering 

[0033] In addition, even in case of a binary shape,
when a boundary is to be smoothly changed from
opaque to transparent, feathering (smoothing of a
boundary shape) is used. As feathering, a linear
feathering mode for linearly interpolating a boundary value,
and a feathering filter mode using a filter are available.
For a multi-valued shape with constant opacity, a
constant alpha mode is available, and can be combined with
feathering.

EP 1 018 840 A2

[Texture Coding] 

[0034] Texture coding encodes the luminance and 
color difference components of an object, and process- 
es in the order of DCT (Discrete Cosine Transform), 
quantization, predictive coding, and variable-length 
coding in units of fields/frames. 
[0035] The DCT uses an 8 X 8 pixel block as a 
processing unit. When an object boundary is located 
within a block, pixels outside the object are padded by 
the average value of the object. After that, a 4-tap two- 
dimensional filter process is executed to prevent any 
large pseudo peaks from being generated in DCT coef- 
ficients. 
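
The padding step for a block that straddles an object boundary can be sketched as follows. Only the average-value padding is shown; the subsequent filter process is omitted, and the function name and the rounded integer mean are assumptions for illustration.

```python
def pad_boundary_block(block, mask):
    """Fill pixels outside the object with the mean of the pixels inside it.

    block : 2-D list of pixel values (8 x 8 in the text, any size here)
    mask  : 2-D list of the same size, non-zero where the pixel is inside
    """
    inside = [p for prow, mrow in zip(block, mask)
              for p, m in zip(prow, mrow) if m]
    if not inside:
        # No object pixel in this block: nothing to derive a mean from.
        return [row[:] for row in block]
    mean = (sum(inside) + len(inside) // 2) // len(inside)  # rounded integer mean
    return [[p if m else mean for p, m in zip(prow, mrow)]
            for prow, mrow in zip(block, mask)]
```

Replacing the outside pixels with the object mean removes the sharp step at the boundary, which is what would otherwise spread energy across the DCT coefficients.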

[0036] Quantization uses either an ITU-T recommen- 
dation H.263 quantizer or MPEG 2 quantizer. When the 
MPEG 2 quantizer is used, nonlinear quantization of DC 
components and frequency weighting of AC compo- 
nents can be implemented. 

[0037] Intra-coding coefficients after quantization un- 
dergo predictive coding between neighboring blocks be- 
fore variable-length coding to remove redundancy com- 
ponents. Especially, in MPEG 4, both DC and AC com- 
ponents undergo predictive coding. 
[0038] AC/DC predictive coding in texture coding
checks the difference (gradient) between corresponding
quantization coefficients between the block of interest
and its neighboring block, and uses a smaller
quantization coefficient in prediction, as shown in Fig. 8.
[0039] For example, upon predicting DC coefficient x
of the block of interest, if corresponding DC
coefficients of the neighboring block are a, b, and c, the
DC coefficient to be used in prediction is determined as
per:

if $|a - b| < |b - c|$, DC coefficient c is used in
prediction; or

if $|a - b| \geq |b - c|$, DC coefficient a is used in
prediction.
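
The selection rule above can be written directly as a short sketch. The function names are hypothetical, and the arguments a, b, and c are simply the neighboring DC coefficients named in the text.

```python
def select_dc_predictor(a, b, c):
    """Choose the neighboring DC coefficient used to predict x.

    The rule compares the gradient between a and b with the gradient
    between b and c and predicts from c when the first one is smaller.
    """
    if abs(a - b) < abs(b - c):
        return c
    return a


def dc_prediction_error(x, a, b, c):
    """Difference that remains to be coded after prediction."""
    return x - select_dc_predictor(a, b, c)
```

When both gradients are equal, the second branch applies and coefficient a is used, as stated by the "greater than or equal" condition.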

[0040] Upon predicting AC coefficient x of the block 
of interest as well, a coefficient to be used in prediction 
is selected in the same manner as described above, and 
is normalized by a quantization scale value QP of each 
block. 

[0041] Predictive coding of DC components checks 
the difference (vertical gradient) between DC compo- 
nents of the block of interest and its vertically neighbor- 
ing block and the difference (horizontal gradient) be- 
tween DC components of the block of interest and its 
horizontally neighboring block among neighboring 
blocks, and encodes the difference from the DC com- 
ponent of the block in a direction with a smaller gradient 



as a prediction error. 

[0042] Predictive coding of AC components uses
corresponding coefficients of neighboring blocks in
correspondence with predictive coding of DC components.
However, since quantization parameter values may be
different among blocks, the difference is calculated after
normalization (quantization step scaling). The
presence/absence of prediction can be selected in units of
macroblocks.

[0043] After that, AC components are zigzag-scanned,
and undergo three-dimensional (Last, Run,
and Level) variable-length coding. Note that Last is a
1-bit value indicating the end of coefficients other than
zero, Run is a zero run length, and Level is a non-zero
coefficient value.
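
The zigzag scan and the (Last, Run, Level) events can be sketched as follows. Only the formation of the events is shown; the variable-length code tables are omitted, and the function names are assumptions for illustration.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even anti-diagonals run from bottom-left to top-right,
        # odd ones from top-right to bottom-left.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def scan_ac(block):
    """Zigzag-scan a block and drop the DC coefficient at position (0, 0)."""
    return [block[r][c] for r, c in zigzag_order(len(block))][1:]


def last_run_level(coeffs):
    """Turn scanned coefficients into (Last, Run, Level) events.

    Run counts the zeros before each non-zero Level, and Last is 1 only
    on the final non-zero coefficient.
    """
    events = []
    run = 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            events.append([0, run, c])
            run = 0
    if events:
        events[-1][0] = 1
    return [tuple(e) for e in events]
```

Trailing zeros after the final non-zero coefficient produce no event at all, which is why the Last flag is sufficient to mark the end of the block.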

[0044] Variable-length coding of DC components en- 
coded by intra coding uses either a DC component var- 
iable-length coding table or AC component variable- 
length coding table. 

[Motion Compensation]

[0045] In MPEG 4, a video object plane (VOP) having
an arbitrary shape can be encoded. VOPs include those
coded by three different types of predictive coding, i.e.,
an intra coded VOP (I-VOP), a forward predicted VOP
(P-VOP), and a bi-directionally predicted VOP (B-VOP), as
described above, and the prediction unit uses a
macroblock of 16 lines X 16 pixels or 8 lines X 8 pixels.
Hence, some macroblocks extend across the boundaries
of VOPs. In order to improve the prediction efficiency
at the VOP boundary, macroblocks on a boundary
undergo padding and polygon matching (matching of
only an object portion).

[Wavelet Coding] 

[0046] The wavelet transform is a transformation
scheme that uses a plurality of functions obtained by
upscaling, downscaling, and translating a single isolated
wave function as transformation bases. A still image
coding mode (Texture Coding Mode) using this wavelet
transform is suitable as a high image quality coding
scheme having various spatial resolutions ranging from
high resolutions to low resolutions, when an image
obtained by synthesizing a computer graphics (CG) image
and natural image is to be processed. Since wavelet
coding can simultaneously encode an image without
segmenting it into blocks, block distortion can be
prevented from being generated even at a low bit rate, and
mosquito noise can be reduced. In this manner, the
MPEG 4 still image coding mode can adjust the trade-off
among broad scalability from low-resolution,
low-quality images to high-resolution, high-quality images,
complexity of processes, and coding efficiency in
correspondence with applications.
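
The idea of splitting a signal into a low-resolution part and the detail needed to restore it can be sketched with the Haar wavelet. The Haar wavelet is used here only because it is the simplest case; it is an assumption for illustration and is not stated to be the filter used by the MPEG 4 still image coding mode. The sketch assumes an even number of samples.

```python
def haar_step(signal):
    """One analysis level: pairwise averages (low) and half differences (high)."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    low = [(a + b) / 2 for a, b in pairs]    # half-resolution approximation
    high = [(a - b) / 2 for a, b in pairs]   # detail lost by the approximation
    return low, high


def haar_inverse(low, high):
    """Rebuild the full-resolution signal from one analysis level."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out
```

Applying `haar_step` again to the low band yields the next lower resolution, so a decoder can stop at any level, which is the scalability from low-resolution to high-resolution images mentioned above.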
[Hierarchical Coding (Scalability)]

[0047] In order to implement scalability, the hierarchical
structure of a syntax is constructed, as shown in
Figs. 9A and 9B. Hierarchical coding is implemented by
using, e.g., base layers as lower layers, and enhancement
layers as upper layers, and coding "difference
information" that improves the image quality of a base
layer in an enhancement layer. In case of spatial scalability,
"base layer + enhancement layer" expresses a
high-resolution moving image.

[0048] Furthermore, scalability has a function of
hierarchically improving the image quality of the entire
image, and improving the image quality of only an object
region in the image. For example, in case of temporal
scalability, a base layer is obtained by encoding the
entire image at a low frame rate, and an enhancement
layer is obtained by encoding data that improves the frame
rate of a specific object in the image.

•Temporal Scalability

[0049] Temporal scalability shown in Fig. 9A specifies
a hierarchy of frame rates, and can increase the frame
rate of an object in an enhancement layer. The
presence/absence of hierarchy can be set in units of objects.
There are two types of enhancement layers: type 1 is
composed of a portion of an object in a base layer, and
type 2 is composed of the same object as a base layer.

•Spatial Scalability

[0050] Spatial scalability shown in Fig. 9B specifies a
hierarchy of spatial resolutions. A base layer allows
downsampling of an arbitrary size, and is used to predict
an enhancement layer.

[Sprite Coding]

[0051] A sprite is a two-dimensional object such as a
background image or the like in a three-dimensional
spatial image, which allows the entire object to integrally
express movement, rotation, deformation, and the like.
A scheme for coding this two-dimensional object is
called sprite coding.

[0052] Sprite coding is classified into four types, i.e.,
static/dynamic and online/offline: a static sprite obtained
by direct transformation of a template object by an
arrangement that sends object data to a decoder in
advance and sends only global motion coefficients in real
time; a dynamic sprite obtained by predictive coding
from a temporally previous sprite; an offline sprite
encoded by intra coding (I-VOP) in advance and sent to
the decoder side; and an online sprite simultaneously
generated by an encoder and decoder during coding.
[0053] Techniques that have been examined in
association with sprite coding include static sprite coding,
dynamic sprite coding, global motion compensation, and
the like.

• Static Sprite Coding

Static sprite coding is a method of encoding the
background (sprite) of the entire video clip in advance,
and expressing an image by geometric
transformation of a portion of the background. The
extracted partial image can express various deformations
such as translation, upscaling/downscaling,
rotation, and the like. As shown in Fig. 10A,
viewpoint movement in a three-dimensional space
expressed by movement, rotation, upscaling/downscaling,
deformation, or the like of an image is called
"warp".

There are four types of warp: perspective
transformation, affine transformation, equidirectional
upscaling (a)/rotation (θ)/movement (c, f), and
translation, which are respectively given by equations in
Fig. 10B. Also, coefficients of equations shown in
Fig. 10B define movement, rotation, upscaling/
downscaling, deformation, and the like. A sprite is
generated offline before the beginning of coding.

In this manner, static sprite coding is implemented
by extracting a partial region of a background
image and warping the extracted region. A
partial region included in a sprite (background)
image shown in Fig. 11 is warped. For example, the
background image is an image of a stand in a
tennis match, and the region to be warped is an
image including an object with motion such as a tennis
player. In static sprite coding, only geometric
transform parameters are encoded, but prediction errors
are not encoded.

• Dynamic Sprite Coding

In static sprite coding, a sprite is generated
before coding. By contrast, in dynamic sprite coding,
a sprite can be updated online during coding. Also,
dynamic sprite coding encodes prediction errors
unlike static sprite coding.

• Global Motion Compensation (GMC)

Global motion compensation is a technique for
implementing motion compensation by expressing
motion of the entire object by one motion vector
without segmenting it into blocks, and is suitable for
motion compensation of a rigid body. Also, a
reference image serves as an immediately preceding
decoded image in place of a sprite, and prediction
errors are coded like in dynamic sprite coding. However,
unlike static and dynamic sprite coding processes,
neither a memory for storing a sprite nor shape
information is required. Global motion compensation
is effective for expressing motion of the entire
frame and an image including zoom.

[Scene Description Information]


[0054] Objects are synthesized based on scene configuration
information. In MPEG 4, configuration information
which is used to synthesize the individual objects
into a scene is sent. Upon receiving the individually
encoded objects, the receiving side can synthesize them
into the scene that the transmitting side intended, using
the scene configuration information.

[0055] The scene configuration information contains 
the display times and positions of the objects, which are 
described as nodes in a tree pattern. Each node has 
relative time information and relative spatial coordinate 
position information on the time base with respect to a 
parent node. As a language that describes the scene 
configuration information, BIFS (Binary Format for 
Scenes) obtained by modifying VRML, and AAVS 
(Adaptive Audio-Visual Session Format) using Java™ 
are available. BIFS is a binary description format of 
MPEG 4 scene configuration information. AAVS is de- 
veloped based on Java™, has a high degree of free- 
dom, and compensates for BIFS. Fig. 12 shows an ex- 
ample of the configuration of the scene description lan- 
guage. 
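
The tree of nodes with relative time and position information can be sketched as follows. This is an illustrative Python model only, not BIFS syntax; the class name, field names, and the example scene are assumptions made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    dx: float = 0.0   # position relative to the parent node
    dy: float = 0.0
    dt: float = 0.0   # display time relative to the parent node
    children: list = field(default_factory=list)


def resolve(node, x=0.0, y=0.0, t=0.0, out=None):
    """Walk the tree from the top node and accumulate absolute position and time."""
    out = {} if out is None else out
    x, y, t = x + node.dx, y + node.dy, t + node.dt
    out[node.name] = (x, y, t)
    for child in node.children:
        resolve(child, x, y, t, out)
    return out
```

Because each node stores only offsets from its parent, moving a parent node moves every object below it.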

[Scene Description] 

[0056] Scene description uses BIFS. Note that a 
scene graph and node as concepts common to VRML 
and BIFS will be mainly explained below. 
[0057] A node designates grouping of lower nodes 
which have attributes such as a light source, shape, ma- 
terial, color, coordinates, and the like, and require coor- 
dinate transformation. By adopting the object-oriented 
concept, the location of each object in a three-dimensional
space and the way it looks in that space are determined
by tracing a tree called a scene graph from the
top node and acquiring attributes of upper nodes. By
synchronously assigning media objects, e.g., an MPEG 4
video bitstream, to nodes as leaves of the tree, a moving
image or picture can be synthesized and displayed
in a three-dimensional space together with other graph- 
ics data. 

[0058] Differences from VRML are as follows. The 
MPEG 4 system supports the following items in BIFS: 

(1) two-dimensional overlap relationship descrip- 
tion of MPEG 4 video VOP coding, and synthesis 
description of MPEG 4 audio; 

(2) sync process of continuous media stream; 

(3) dynamic behavior expression (e.g., sprite) of an 
object; 

(4) standardization of the transmission format (bi- 
nary); and 

(5) dynamic change of scene description in session. 

[0059] Almost all VRML nodes except for Extrusion,
Script, Proto, and ExternProto are supported by BIFS.
New MPEG 4 special nodes added in BIFS are:

(1) node for 2D/3D synthesis

(2) node for 2D graphics and text

(3) animation node

(4) audio node 

[0060] Note that VRML does not support 2D synthesis
except for a special node such as a background, but
BIFS expands description to allow text/graphics overlay
and MPEG 4 video VOP coding in units of pixels.
[0061] In the animation node, a special node for an
MPEG 4 CG image such as a face composed of 3D
meshes is specified. A message (BIFS Update) that allows
transposition, deletion, addition, and attribute
change of nodes in the scene graph is prepared, so that
a new moving image can be displayed or a button can
be added on the screen during a session. BIFS can be
implemented by replacing reserved words, node identifiers,
and attribute values of VRML by binary data in
nearly one-to-one correspondence.



[0062] Fig. 13 shows the types of MPEG 4 audio coding
schemes. Audio and sound coding schemes include
parametric coding, CELP (Code Excited Linear Prediction)
coding, and time/frequency conversion coding.
Furthermore, an SNHC (Synthetic Natural Hybrid Coding)
audio function is adopted, which includes SA (Structured
Audio) coding and TTS (Text to Speech) coding.
SA is a structural description language of synthetic music
tones including MIDI (Musical Instrument Digital Interface),
and TTS is a protocol that sends intonation, phoneme
information, and the like to an external text-to-speech
synthesis apparatus.

[0063] Fig. 14 shows the arrangement of an audio
coding system. Referring to Fig. 14, an input sound signal
is pre-processed (201), and is divided (202) in accordance
with the frequency band so as to selectively
use three different coding schemes, i.e., parametric
coding (204), CELP coding (205), and time/frequency
conversion coding (206). The divided signal components
are input to suitable encoders. Signal analysis
control (203) analyzes the input audio signal to generate
control information and the like for assigning the input
audio signal to the individual encoders.
[0064] Subsequently, a parametric coding core (204),
CELP coding core (205), and time/frequency conversion
coding core (206) as independent encoders execute
encoding processes based on their own coding
schemes. These three different coding schemes will be
explained later. Parametric- and CELP-coded audio data
undergo small-step enhancement (207), and time/frequency
conversion-coded and small-step-enhanced
audio data undergo large-step enhancement (208).
Note that small-step enhancement (207) and large-step
enhancement (208) are tools for reducing distortion produced
in the respective encoding processes. The large-step-enhanced
audio data becomes an encoded sound
bitstream.

[0065] The arrangement of the sound coding system
shown in Fig. 14 has been explained. The respective
coding schemes will be explained below with reference
to Fig. 13.

• Parametric Coding 

Parametric coding expresses a sound signal in- 
cluding an audio signal and music tone signal, by 
parameters such as frequency, amplitude, pitch, 
and the like, and encodes these parameters. Para- 
metric coding includes HVXC (Harmonic Vector Ex- 
citation Coding) for an audio signal, and IL (Individ- 
ual Line) coding for a music tone signal. 

HVXC coding mainly aims at audio coding
ranging from 2 kbps to 4 kbps, classifies an audio
signal into voiced and unvoiced tones, and encodes
voiced tones by vector-quantizing the harmonic
structure of a residual signal of an LPC (Linear Prediction
Coefficient). Also, HVXC coding directly encodes
unvoiced tones by vector excitation coding
of a prediction residual.

IL coding aims at coding of music tones ranging
from 6 kbps to 16 kbps, and encodes a signal by
modeling it by a line spectrum.

• CELP Coding

CELP coding is a scheme for encoding an input 
sound signal by separating it into spectrum enve- 
lope information and sound source information 
(prediction error). The spectrum envelope informa- 
tion is expressed by an LPC calculated from an in- 
put sound signal by linear prediction analysis. 
MPEG 4 CELP coding includes narrowband (NB)
CELP having a bandwidth of 4 kHz, and wideband
(WB) CELP having a bandwidth of 8 kHz. NB CELP
can select a bit rate from 3.85 to 12.2 kbps, and WB
CELP can select a bit rate from 13.7 to 24 kbps.

• Time/Frequency Conversion Coding

Time/frequency conversion coding is a coding 
scheme that aims at high sound quality. This coding 
includes a scheme complying with AAC (Advanced
Audio Coding), and TwinVQ (Transform-domain
Weighted Interleave Vector Quantization). This
time/frequency conversion coding contains a psy- 
choacoustic model, and makes adaptive quantiza- 
tion exploiting an auditory masking effect. 

The scheme complying with AAC frequency- 
converts an audio signal by e.g., the DCT, and 
adaptively quantizes the converted signal exploiting 
an auditory masking effect. The adaptive bit rate 
ranges from 24 kbps to 64 kbps. 

The TwinVQ scheme smoothes an MDCT co- 
efficient of an audio signal using a spectrum enve- 
lope obtained by linear prediction analysis of an au- 
dio signal. After the smoothed signal is interleaved, 
it is vector-quantized using two code lengths. The 
adaptive bit rate ranges from 6 kbps to 40 kbps. 



[System Structure] 

[0066] The system part in MPEG 4 defines multiplexing,
demultiplexing, and synthesis. The system structure
will be explained below with reference to Fig. 15.
[0067] In multiplexing, each elementary stream including
individual objects as outputs from video and audio
encoders, scene configuration information that describes
the spatial layout of the individual objects, and
the like is packetized by an access unit layer. The access
unit layer appends, as a header, a time stamp, reference
clock, and the like for establishing synchronization
for each access unit. The obtained packetized streams
are multiplexed by a FlexMux layer in a unit that considers
a display unit and error robustness, and are sent to a
TransMux layer.

[0068] The TransMux layer appends an error correction
code in a protection sub layer in correspondence
with the necessity of error robustness. Finally, a multiplex
sub layer (Mux Sub Layer) outputs a single TransMux
stream onto a transmission path. The TransMux
layer is not defined in MPEG 4, and can use existing
network protocols such as UDP/IP (User Datagram Protocol/Internet
Protocol) as an Internet protocol, MPEG 2
transport stream (TS), ATM (Asynchronous Transfer
Mode) AAL2 (ATM Adaptation Layer 2), videophone multiplexing
scheme (ITU-T recommendation H.223) using
a telephone line, digital audio broadcast, and the like.
[0069] In order to reduce the overhead of the system
layer, and to allow easy embedding in a conventional
transport stream, the access unit layer or FlexMux layer
may be bypassed.

[0070] On the decode side, in order to synchronize individual
objects, a buffer (DB: Decoding Buffer) is inserted
after demultiplexing to absorb arrival and decoding
time differences of the individual objects. Before synthesis,
a buffer (CB: Composition Buffer) is also inserted to
adjust the display timing.
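
The role of the time stamp and the buffers can be sketched as below. This is a simplified Python illustration that assumes a single buffer standing in for both DB and CB, and an integer composition time stamp per access unit; the class and field names are hypothetical and do not come from the MPEG 4 system specification.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class AccessUnit:
    cts: int                                  # time stamp used for display ordering
    stream: str = field(compare=False)        # e.g. "video-1", "audio"
    payload: bytes = field(compare=False)


def compose_in_order(arrivals):
    """Buffer access units that arrive out of order and release them by time stamp."""
    buf = []
    for au in arrivals:
        heapq.heappush(buf, au)
    return [heapq.heappop(buf) for _ in range(len(buf))]
```

Access units of different objects may arrive in any order; because each carries its own time stamp, the receiver can still present them in the intended sequence.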

[Basic Structure of Video Stream]

[0071] Fig. 16 shows the layer structure. Respective
layers are called classes, and each class has a header.
The header contains various kinds of code information,
such as startcode, endcode, ID, shape, size, and the
like.

• Video Stream 

A video stream consists of a plurality of sessions.
A session means one complete sequence.

A video session (VS) is formed by a plurality of
video objects (VOs).

Each video object (VO) consists of a plurality
of video object layers (VOLs).

Each video object layer (VOL) is a sequence
including a plurality of layers in units of objects.

A group of video object planes (GOV) consists
of a plurality of VOPs.

Note that a plane indicates an object in units of
frames.

[Bitstream Structure Having Error Robustness]

[0072] In MPEG 4, the coding scheme itself has resilience
or robustness against transmission errors to
cope with error-prone mobile communications (radio
communications). Error correction in an existing standard
scheme is mainly done on the system (sender) side.
However, in a network such as PHS (Personal Handyphone
System), the error rate is very high, and errors
that cannot be corrected by the system may mix in a
video encoded portion. In consideration of such errors,
MPEG 4 assumes various error patterns that cannot be
corrected by the system, and adopts an error robust
coding scheme that can suppress propagation of errors
as much as possible in such an environment. An example
of error robustness that pertains to image coding, and
a bitstream structure therefor will be explained below.

• Reversible VLC (RVLC) and Reversible Decoding 
As shown in Fig. 17, when an error is detected 
during decoding, the decoding process is paused 
there, and the next sync signal is detected. When 
the next sync signal has been detected, the bit- 
stream is decoded in an opposite direction from the 
detection position of the sync signal. The number 
of decoding start points is increased without any 
new additional information, and the decodable in- 
formation size upon production of errors can be in- 
creased compared to the conventional system. 
Such variable-length coding that can decode from 
both the forward and reverse directions implements 
"reversible decoding". 
• Multiple Transmission of Important Information

As shown in Fig. 18, a structure that can trans- 
mit important information a plurality of times is in- 
troduced to reinforce error robustness. For exam- 
ple, in order to display individual VOPs at correct 
timings, time stamps are required, and such infor- 
mation is contained in the first video packet. Even 
if this video packet is lost by errors, decoding can 
be restarted from the next video packet by the afore- 
mentioned reversible decoding structure. However, 
since this video packet contains no time stamp, the 
display timing cannot be detected after all. For this 
reason, a structure in which a flag called HEC 
(Header Extension Code) is set in each video pack- 
et, and important information such as a time stamp 
and the like can be appended after that flag is intro- 
duced. After the HEC flag, the time stamp and VOP 
coding mode type can be appended. 

If synchronization has an error, decoding is
started from the next resynchronization marker
(RM). In each video packet, information required for
that process, i.e., the number of the first MB contained
in that packet and the quantization step size
for that MB, are set immediately after RM. The HEC
flag is inserted after such information; when HEC =
'1', TR and VCT are appended immediately thereafter.
With such HEC information, even when the
first video packet cannot be decoded and is discarded,
video packets starting from one set with HEC =
'1' can be normally decoded and displayed. Whether
or not HEC is set at '1' can be freely set on the
encoder side.

• Data Partitioning

Since the encoder side forms a bitstream by repeating
encoding processes in units of MBs, if an
error has corrupted a portion of the stream, MB data
after the error cannot be decoded. On the other
hand, a plurality of pieces of MB information are
classified into some groups, these groups are set in
a bitstream, and marker information is inserted at
the boundaries of groups. With this format, even
when an error mixes in the bitstream and data after
that error cannot be decoded, synchronization is established
again using the marker inserted at the end
of the group, and data in the next group can be normally
decoded.

Based on the aforementioned concept, data
partitioning that classifies motion vectors and texture
information (DCT coefficients and the like) in
units of video packets is adopted. A motion marker
(MM) is set at the boundaries of groups.

Even when an error mixes in the middle of motion
vector information, the DCT coefficient after
MM can be normally decoded. Hence, MB data corresponding
to a motion vector before mixing of the
error can be accurately reconstructed as well as the
DCT coefficient. Even when an error mixes in texture
information, an image which is accurate to
some extent can be reconstructed by interpolation
(concealment) using motion vector information and
decoded previous frame information as long as the
motion vector is normally decoded.

• Variable-length Interval Synchronization Scheme
A resynchronization scheme for variable-length
packets will be explained below. An MB group containing
a sync signal at the head of the group is
called a "video packet", and the number of MBs contained
in that packet can be freely set on the encoder
side. When an error mixes in a bitstream that uses
VLCs (Variable Length Codes), the subsequent
codes cannot be synchronized and cannot be decoded.
Even in such a case, by detecting the next resynchronization
marker, the following information
can be normally decoded.
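
The reversible decoding described above under Reversible VLC can be illustrated with a toy Python sketch. It assumes a small table of palindromic codewords (so the same table decodes in both directions), marks the corrupted bits with '?', and treats the end of the string as the position of the next sync signal. The code table and function names are assumptions for illustration and are not the MPEG 4 RVLC tables.

```python
CODE = {"a": "0", "b": "11", "c": "101", "d": "1001"}   # palindromic, prefix-free
DECODE = {bits: sym for sym, bits in CODE.items()}
MAXLEN = max(len(bits) for bits in DECODE)


def rvlc_encode(symbols):
    return "".join(CODE[s] for s in symbols)


def decode_prefix(bits):
    """Decode from the left and stop at the first bit that cannot be decoded."""
    out, cur = [], ""
    for b in bits:
        if b not in "01":          # corrupted bit: pause decoding here
            break
        cur += b
        if cur in DECODE:
            out.append(DECODE[cur])
            cur = ""
        elif len(cur) >= MAXLEN:   # no codeword matches: treat as an error
            break
    return out


def decode_both_ways(bits):
    """Forward decode up to the error, then decode backwards from the next sync point."""
    head = decode_prefix(bits)
    tail = decode_prefix(bits[::-1])[::-1]
    return head, tail
```

A forward-only decoder would recover only the symbols before the error; decoding backwards from the sync position additionally recovers the symbols after it. On an undamaged stream both passes return the full sequence, so a real decoder would stop the backward pass where the forward pass ended.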

[Byte Alignment] 

[0073] In order to attain matching with the system, information
is multiplexed in units of integer multiples of
bytes. A bitstream has a byte alignment structure. In order
to achieve byte alignment, stuffing bits are inserted
at the end of each video packet. The stuffing bits are
also used as an error check code in a video packet.
[0074] The stuffing bits consist of a code like '01111',
i.e., the first bit = '0' and other bits = '1'. More specifically,
if MBs in a given video packet are normally decoded
up to the last MB, a code that appears after that MB is
always '0', and a run of '1's having a length 1 bit shorter
than that of the stuffing bits should appear after '0'. If a
pattern that violates this rule is detected, this means that
decoding before that pattern is not normal, and an error
in a bitstream can be detected.
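
The stuffing rule and the check it enables can be sketched in Python as follows. The sketch represents bits as a string of '0' and '1' characters and assumes that between 1 and 8 stuffing bits are always appended, so that a packet already ending on a byte boundary still receives a full stuffing byte; this length rule and the function names are assumptions of the sketch.

```python
def stuffing_bits(payload_len):
    """Return '0' followed by '1's so that payload plus stuffing is byte aligned."""
    n = 8 - (payload_len % 8)          # 1..8 bits (assumed rule)
    return "0" + "1" * (n - 1)


def close_packet(payload):
    """Append stuffing bits to the payload bits of one video packet."""
    return payload + stuffing_bits(len(payload))


def stuffing_ok(packet, decoded_len):
    """Check the bits that follow the last decoded MB.

    decoded_len is the number of bits the decoder believes it consumed.
    If decoding went wrong, the remaining bits will not match the rule.
    """
    tail = packet[decoded_len:]
    return len(packet) % 8 == 0 and tail == stuffing_bits(decoded_len)
```

For example, a 5-bit payload receives the stuffing code '011'; a decoder that consumed only 4 bits would find '0011' where it expects '0111' and can report an error.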
[0075] The MPEG 4 technology has been explained 
with reference to "Outline of MPEG 4 International 
Standards Determined", Nikkei Electronics, 1997.9.22 
issue, p. 147 - 168, "Full Story of Upcoming MPEG 4", 
The Institute of Image Information and Television Engi- 
neers Text, October 2, 1997, "Latest Standardization 
Trend of MPEG 4 and Image Compression Technique", 
Japan Industry Engineering Center Seminar Reference, 
February 3, 1997, and the like. 

First Embodiment 

[Arrangement] 

[0076] A TV broadcast receiving apparatus according 
to the first embodiment of the present invention will be 
described below with reference to the accompanying 
drawings. Fig. 19 is a block diagram showing the arrangement
of a TV broadcast receiving apparatus of the
first embodiment. 

[0077] A digital TV broadcast signal is tuned in and 
received depending on its broadcast pattern, e.g., by a 
satellite antenna 21 and tuner 23 in case of satellite 
broadcast or by a tuner 24 via a cable 22 in case of cable 
broadcast. TV information received from satellite or ca- 
ble broadcast is input to a data selector 43 to select one 
data sequence. The selected data sequence is demod- 
ulated by a demodulation circuit 25, and the demodulat- 
ed data undergoes error correction in an error correction 
circuit 26. 

[0078] Subsequently, the TV information is demulti- 
plexed by a multiplexed signal demultiplexing circuit 27 
into image data, sound data, and other system data (ad- 
ditional data). Of these data, sound data is decoded by 
a sound decoding circuit 28 to obtain stereo audio data 
A(L) and A(R), which are input to a sound controller 30 
to adjust the sound volume and sound field lateralization 
and to make a multi-sound channel process such as a 
main/sub sound channel. After that, the sound data to 
be output is selected, and is converted by a digital-analog
converter (D/A) 29 into an analog signal. The analog
signal is reproduced via a loudspeaker 31.
[0079] On the other hand, image data is decoded by
an image decoding circuit 32 including a plurality of decoders
which perform decoding processes in correspondence
with individual objects in the image data. This decoding
scheme decodes in units of objects on the basis
of the aforementioned MPEG 4 image coding scheme.
Decoded image data are images v(1) to v(i) corresponding
to the number of objects, which undergo various
processes on the basis of display by a display controller
34.

[0080] Display control done by the display controller 
34 includes a process for determining whether or not 
each object is displayed, a process for upscaling/down- 
scaling each object, a process for determining the dis- 
play position of each object on the frame, and the like. 
Furthermore, the display control includes various dis- 
play processes such as synthesis of objects and char- 
acter images (time indication, index title, and the like) 
generated by a character generation circuit 40, and the 
like. Such display control processes are done under the 
control of a system controller 38 on the basis of layout 
information of individual objects, i.e., scene description 
information from a scene description data conversion 
circuit 39. 

[0081] The formed display image is converted into an
analog signal by a D/A converter 33, and is displayed 
on a CRT 35, or is sent to and displayed on a liquid crys- 
tal display (LCD) 44 or the like as a digital signal. 
[0082] On the other hand, the system data (including 
additional data) is decoded by a system data decoding 
circuit 36. From the decoded system data, an ID detec- 
tor 37 detects a program ID appended to a program. The 
detected program ID is input to the system controller 38
to serve as a dedicated command for program discrim- 
ination. Also, of the decoded system data, data that per- 
tains to scene description is input to the scene descrip- 
tion data conversion circuit 39. The remaining system 
data (including time data) are input as various com- 
mands to the system controller 38. Note that the addi- 
tional data may include a document or the like such as 
a title index of a program or the like. 
[0083] The display controller 34 sets a layout of the
individual objects and the sound controller 30 sets the
sound volume, sound field lateralization, and the like using
scene description data obtained by the scene description
data conversion circuit 39. By adjusting the
scene description data conversion circuit 39 and controlling
the display controller 34 under the control of the
system controller 38, an arbitrary layout of individual objects
that the user desires and that differs from the basic
layout can be set. The layout setting method will be described
later.

[0084] When a display image which is not handled as 
an object, e.g., a time indication frame, title index, or the 
like, is generated, the character generation circuit 40 is 
used. Under the control of the system controller 38, time 
indication character data is generated on the basis of 
time data contained in the additional data, time informa- 
tion generated inside the receiver, or the like using a 
memory 42 such as a ROM or the like that saves char- 
acter data. The same applies to title index data. The 
generated image is synthesized with objects by the dis- 
play controller 34. 






[0085] The user can input various commands via an 
instruction input unit 45. As for objects for which a layout 
is to be changed based on the user instruction input, 
their positions, sizes, and the like are adjusted, and 
those objects are displayed in a layout that the user de- 
sired. That is, layout correction and input of new setting 
values are made via the instruction input unit 45. The 
system controller 38 appropriately controls the opera- 
tions of the respective units in accordance with input in- 
struction values to obtain a desired output (display, re- 
production) pattern. 

[Layout Setups] 

[0086] An example of the layout setting method will 
be explained below. Fig. 20 is a diagram for explaining 
the method of setting position data upon layout setups, 
and Fig. 21 is a view for explaining the method of input- 
ting an image and instruction upon layout setups. 
[0087] There are two methods of setting the position 
of an object. The first method shifts the position of a ba- 
sic layout specified by scene description data, and the 
second method allows the user to set a new object po- 
sition at an arbitrary location. One of these methods can 
be selected by a selector 302 shown in Fig. 20 in ac- 
cordance with user operation. 
[0088] The method of shifting the basic layout as the 
first method will be explained first. Image data is input
as an object, and the basic position of that object is expressed
by position data (X0, Y0) designated by scene
description data. When the user wants to shift that object,
a correction amount (ΔX, ΔY) is added to the position
data (X0, Y0) by an adder 301, and new position
data (X', Y') is used as layout setting data of the object.
[0089] The object size is adjustable by increasing/de- 

ject by a prescribed value (e.g., an integer) in the display 
controller 34. The object whose upscaling/downscaling 
factor has been arbitrarily changed is synthesized with 
a background image. When a given object is not dis- 
played, the object which is not to be displayed is proc- 
essed not to be synthesized with the display frame upon 
synthesizing objects. 

[0090] The method of setting the new object position
as the second method will be described. A new object
position (X, Y) is set independently of basic position data,
and is used as position data (X', Y') that replaces the
basic position data. In this manner, an object is moved.
[0091] The system controller 38 controls the display 
controller 34 to implement a process for determining 
whether or not a given object is synthesized with the dis- 
play frame (to turn on/off object display) and a process 
for upscaling/downscaling a given object by interpolat- 
ing/decimating pixels. The control data used at that time 
is held as layout setting data. 
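
The display ON/OFF and upscaling/downscaling processes can be sketched as follows. This Python illustration assumes an image held as a list of pixel rows, an integer scaling factor, pixel repetition as the simplest form of interpolation, and an opaque rectangular object; the function names are hypothetical and do not describe the actual circuitry of the display controller 34.

```python
def upscale(pixels, k):
    """Enlarge by an integer factor k using pixel repetition (simple interpolation)."""
    return [[p for p in row for _ in range(k)] for row in pixels for _ in range(k)]


def downscale(pixels, k):
    """Shrink by an integer factor k by keeping every k-th pixel (decimation)."""
    return [row[::k] for row in pixels[::k]]


def compose(background, obj, x, y, visible=True):
    """Paste obj onto a copy of the background at (x, y); skip it when display is OFF."""
    frame = [row[:] for row in background]
    if not visible:
        return frame
    for j, row in enumerate(obj):
        for i, p in enumerate(row):
            if 0 <= y + j < len(frame) and 0 <= x + i < len(frame[0]):
                frame[y + j][x + i] = p
    return frame
```

An object whose display is turned off is simply left out of the synthesis step, so the background remains visible at its position.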

[0092] As for sound data, the system controller 38
controls the scene description data conversion circuit 39
to adjust or change scene description data for sound,
so as to obtain an audio output that the user desires.
Such data is called a sound layout, and control data at
that time is called layout setting data for sound.
[0093] Fig. 21 depicts the aforementioned position
setting methods. On a display device 303 such as a CRT
or the like, when an object 306 located at a basic position
(X0, Y0) is shifted to a shift position 307, layout setting
data (X', Y') as final position data obtained by adding the
shift amount to the basic position data is (X0+ΔX,
Y0+ΔY). On the other hand, when the user arbitrarily
lays out an object at a new setting position 308, the layout
setting data (X', Y') is (X, Y).
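
The two position setting methods can be summarized in a short Python sketch. The mode argument stands in for the selector 302 and the addition for the adder 301; the function name and argument names are assumptions made for illustration.

```python
def layout_position(basic, mode, shift=(0, 0), new=None):
    """Return the layout setting position (X', Y') of an object.

    mode "shift": add the correction amount (dX, dY) to the basic position.
    mode "new":   ignore the basic position and use the user-supplied (X, Y).
    """
    x0, y0 = basic
    if mode == "shift":
        return (x0 + shift[0], y0 + shift[1])
    if mode == "new":
        if new is None:
            raise ValueError("a new position is required in 'new' mode")
        return tuple(new)
    raise ValueError(f"unknown mode: {mode}")
```

With a basic position of (100, 50), a shift of (20, -10) yields (120, 40), whereas a new position of (5, 5) is used as is.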
[0094] Fig. 21 illustrates a mouse 304 and remote 
controller 305 as examples of pointing devices included 
in the instruction input unit 45. Using the mouse 304 or 
direction input keys (or a cross-cursor key, joystick, joy- 
pad, or the like) of the remote controller 305, movement 
of a given object can be freely and easily implemented. 
Note that the shift position or new position of a given 
object may be selected from some preset positions such 
as the four corners and center of the frame. 
[0095] TV broadcast data includes a program ID. Us- 
ing such program IDs, the set layouts are converted into 
data in correspondence with program IDs in units of pro- 
grams, and the converted data may be stored as layout 
setting data. As the storage location of layout setting da- 
ta, a nonvolatile memory 41 such as an EEPROM or the 
like is used. Upon detecting a program ID stored in the 
memory 41 from TV broadcast data, the system control- 
ler 38 controls the scene description data conversion cir- 
cuit 39 and display controller 34 on the basis of the lay- 
out setting data corresponding to the detected program 
ID to make image display and sound reproduction in a 
layout set by the user. 

[0096] Subsequently, layout setting data will be explained.
[...] information (e.g., display ON/OFF or new position) of an
object upon layout setups by the user are converted into
data in addition to object layout information on the basis
of the object layout information obtained from scene de- 
scription data, and the converted data can be stored as 
layout setting data. As has already been described pre- 
viously with reference to Fig. 12, the scene description 
data is information for laying out objects that form each 
scene in a tree pattern, and designating the display 
times and positions of the individual objects. 
[0097] As another format of layout setting data, as 
shown in Fig. 22, when ON/OFF data indicating whether 
or not the object of interest is displayed, display position 
data obtained when the display position is two-dimen- 
sionally expressed by the x- and y-axes, and data indi- 
cating the size are held, they can be used as layout set- 
ting data. 
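
A possible in-memory form of such layout setting data, stored in correspondence with program IDs, is sketched below in Python. The field names, the use of a scale factor for the size, and the JSON serialization standing in for the nonvolatile memory 41 are all assumptions of the sketch; the actual format is the one shown in Fig. 22.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ObjectLayout:
    display_on: bool   # ON/OFF data for the object
    x: int             # display position on the x-axis
    y: int             # display position on the y-axis
    scale: float       # size, as a factor of the broadcast size (assumed encoding)


class LayoutMemory:
    """Stands in for the nonvolatile memory; keeps one layout per program ID."""

    def __init__(self):
        self._store = {}   # program ID -> serialized layout setting data

    def save(self, program_id, layouts):
        self._store[program_id] = json.dumps(
            {name: asdict(layout) for name, layout in layouts.items()})

    def load(self, program_id):
        raw = self._store.get(program_id)
        if raw is None:
            return None    # no layout has been set for this program
        return {name: ObjectLayout(**d) for name, d in json.loads(raw).items()}
```

Looking up a program ID that has no stored entry returns nothing, which corresponds to displaying the program in its basic layout.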

[0098] Fig. 23 shows an example of a video display 
layout according to this embodiment. 
[0099] When a video signal sent from the broadcast
station is normally displayed without any changes, a basic
image 106 shown in Fig. 23 is displayed. In this case,
the basic image 106 consists of an entire image (background:
sprite) 101, spot relay image 102, time indication
image 103, weather forecast image 104, and sound
object. In the display example shown in Fig. 23, the time
indication image 103 is contained in image data as an
object.

[0100] Fig. 24 shows the format of a general MPEG 
4 bitstream. Objects contained in the display example 
in Fig. 23 are multiplexed in a database of objects 1 to 
4 in Fig. 24. Objects 1 to 5 respectively correspond to 
the entire image 101, spot relay image 102, time indication
image 103, weather forecast image 104, and sound
data, and additional data containing scene description 
information, program ID, and the like are multiplexed as 
system data, thus forming a bitstream. 
[0101] Using this embodiment, the entire image 101 
can be downscaled, the relay image 102 can be up- 
scaled, and the time indication image 103 and weather 
forecast image 104 can be moved, as shown in a setting 
example 107 in Fig. 23. Also, the time indication image 
103 can be upscaled, as shown in a setting example
108. Such setups can be freely made in units of program
IDs. After such display layout is set, when the corresponding
program ID is detected, the stored setup information
is read out from the memory 41, and the video
data of that program is displayed in the set layout.

[Operation Sequence] 

[0102] Fig. 25 is a flow chart showing the operation 
sequence of the TV broadcast receiving apparatus of 
this embodiment. The operation sequence shown in Fig. 
25 is implemented by executing a program stored in the 
memory 41 or 42 by the system controller 38. Note that 
the program may be pre-stored in the memory 41 or 42. 
Also, the program downloaded via a satellite or cable 
broadcast channel may be stored in the memory 41 or 
42. 

[0103] TV information is received (step S1), and a program
ID is detected from additional data appended to
the TV information (step S2). As for the program ID, 
those different in units of programs are appended by the 
broadcast station, and each program ID is multiplexed 
on TV information together with other additional data. 
Based on the detected program ID, it is checked if layout 
setting data corresponding to that program ID is stored 
(step S3). 

[0104] If layout setting data is saved in correspond- 
ence with the program ID, that layout setting data is read 
out from the memory 41 (step S4), and the user is in- 
quired as to whether or not video display based on the 
saved layout setting data is to be made (step S5). If such 
video display is permitted, video data of the program is 
displayed in the set layout (step S6). 
[0105] If no layout setting data is saved in correspond-
ence with the program ID, or if the user rejects video
display based on the saved layout setting data, it is
checked if a new layout is to be set for that program (step S7).
If a new layout is not set or need not be set, video data
of the program is displayed in a basic layout as it is sent
from the broadcast station (step S8).
[0106] If a new layout is to be set, the control enters the lay-
out setting mode (step S9). Then, the user selects an
object for which a layout is to be adjusted, audio output 
format, or the like from objects that form image data in 
the TV information (step S10), and makes adjustment 
that pertains to a layout such as movement, upscaling/ 
downscaling, display ON/OFF, and the like of the select- 
ed object, or adjusts the audio output format such as the 
sound volume, sound field lateralization, or the like (step 
S11). 

[0107] Upon completion of adjustment for the select- 
ed object, the user decides if layout setups are to end 
(step S12). If the user wants to adjust another object, 
the flow returns to step S10 to repeat selection and ad- 
justment of an object. If the user wants to quit setups, 
layout setting data is stored in the memory 41 in corre- 
spondence with the program ID upon completion of the 
setting mode (step S13). Video data of the program is 
displayed in the newly set layout (step S6). 
[0108] The TV broadcast receiving apparatus of this 
embodiment displays video data of a TV program in the 
aforementioned sequence, and repeats the sequence 
shown in Fig. 25 every time a new program ID is detect- 
ed. 
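
The sequence of Fig. 25 for one detected program ID can be sketched as follows. This is a hedged outline rather than the program actually executed by the system controller 38: the function name `display_program`, the dictionary standing in for the memory 41, and the three callables that stand in for user interaction are all assumptions.

```python
def display_program(program_id, memory, ask_use_saved, ask_set_new, run_setting_mode):
    """Sketch of steps S3 to S13 of Fig. 25.

    memory:           dict standing in for the memory 41 (program ID -> layout)
    ask_use_saved:    callable, True if the user accepts the saved layout (S5)
    ask_set_new:      callable, True if the user wants to set a new layout (S7)
    run_setting_mode: callable performing S9 to S12 and returning the new layout
    Returns the layout to display, or None for the basic layout.
    """
    saved = memory.get(program_id)              # S3, S4
    if saved is not None and ask_use_saved():   # S5
        return saved                            # S6: display in the saved layout
    if not ask_set_new():                       # S7
        return None                             # S8: basic layout as broadcast
    new_layout = run_setting_mode()             # S9 to S12
    memory[program_id] = new_layout             # S13
    return new_layout                           # S6: display in the new layout
```

Calling the function again whenever a new program ID is detected mirrors the repetition described in paragraph [0108].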

[0109] As described above, according to this embod-
iment, the user who watches digital TV broadcast can
adjust (and also turn on/off) the layout of each object,
and the sound volume and sound field lateralization of
audio data, and can set an arbitrary layout in corre-
spondence with video display of a program. Hence, vid-
eo display and sound reproduction according to the user's
preference can be made, the quality of the audiovisual user
interface can be improved, and more flexible TV pro-
gram display can be presented to the user.
[0110] Layouts are set in units of IDs appended to pro-
grams, and layout setting data are stored in correspond-
ence with IDs. Hence, once the layout is set, video dis-
play of a given program can be automatically made in
the set layout by recognizing the layout setting data cor- 
responding to the ID of that program, thus very effec- 
tively adding a new function to TV broadcast display. 
[0111] The layout setting data is not always set by the 
user. For example, the user may select optional layout 
setting data, which is sent together with TV information, 
to adjust video display, and the selected optional layout 
setting data may be stored in the memory 41.

Second Embodiment 

[0112] A TV broadcast receiving apparatus according
to the second embodiment of the present invention will 
be described below. Note that the same reference nu- 
merals in the second embodiment denote the same 
parts as those in the first embodiment, and a detailed 
description thereof will be omitted. 
[0113] In TV broadcast that uses an image encoded
by a coding scheme other than MPEG 4 as one MPEG
4 object, the second embodiment makes video display
of a TV program with high degree of freedom in layout
using a layout (movement, upscaling/downscaling, and
the like of an object) set by the user.
[0114] A case will be exemplified below wherein
MPEG 2 is used as a photo image coding scheme. That
is, a TV broadcast receiving apparatus which receives
and displays an image encoded by MPEG 2 (to be also
referred to as an "MPEG 2 image" hereinafter) multi-
plexed on an MPEG 4 bitstream will be explained below.
Note that the layout setting method in the second em-
bodiment is the same as that described in the first em-
bodiment, and the basic arrangement and operation of
the TV broadcast receiving apparatus are the same as
those described above using Fig. 19. In the second em-
bodiment, however, since details of the sound decoding
circuit 28, image decoding circuit 32, and system data
decoding circuit 36 in Fig. 19 are different in terms of the
TV broadcast decoding method in the second embodi-
ment, they will be explained using Figs. 26 and 27.
[0115] Fig. 26 shows an encoding unit used in a sys-
tem for receiving MPEG 4 TV broadcast in a broadcast
station as the sender side. A data multiplexer 5006 mul-
tiplexes the outputs from sound object, photo image ob-
ject, synthetic image object, character object, and scene
description information encoders 5001 to 5005, that
have been explained previously using Fig. 2, into an
MPEG 4 bitstream, and also multiplexes an MPEG 2 bit-
stream 61 extracted by an MPEG 2 commercial broad-
cast equipment or relay system or upon reproducing a
DVD (Digital Video Disc) into the MPEG 4 bitstream.
[0116] Fig. 27 shows the arrangement of a decoding
unit used in the MPEG 4 bitstream decoding side, i.e.,
the TV broadcast receiving apparatus. The decoding
unit shown in Fig. 27 is included in the sound decoding
circuit 28, image decoding circuit 32, system data de-
coding circuit 36, scene description data conversion cir-
cuit 39, and the like, which are decoding systems and
their associated circuit that construct the TV broadcast
receiving apparatus of the second embodiment.
[0117] The received MPEG 4 bitstream is demulti-
plexed by a data demultiplexer 5007 into individual data
before decoding. Of the demultiplexed data, the sound
object, photo image object, synthetic image object,
character object, and scene description information as
MPEG 4 objects are decoded by corresponding decod-
ers 5008 to 5012. Also, MPEG 2 data multiplexed to-
gether with the MPEG 4 objects is decoded by a dedi-
cated MPEG 2 decoder 62 provided independently of
those for the MPEG 4 objects. Note that the MPEG 2
decoder 62 may use some components of the MPEG 4
image decoding circuit 32.
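
The routing described above, in which each MPEG 4 object goes to its own decoder while the embedded MPEG 2 data goes to a separate, dedicated decoder, might be sketched as below. The function name `decode_units`, the string tags used for the kinds of data, and the decoder callables are hypothetical; no real demultiplexer or codec API is implied.

```python
def decode_units(units, mpeg4_decoders, mpeg2_decoder):
    """Route each demultiplexed unit to the matching decoder.

    units:          iterable of (kind, payload) pairs, as a demultiplexer might emit
    mpeg4_decoders: dict mapping an MPEG 4 object kind to a decoder callable
    mpeg2_decoder:  callable used only for the embedded MPEG 2 data
    """
    decoded = []
    for kind, payload in units:
        if kind == "mpeg2":
            # MPEG 2 data is handled independently of the MPEG 4 objects.
            decoded.append((kind, mpeg2_decoder(payload)))
        else:
            decoded.append((kind, mpeg4_decoders[kind](payload)))
    return decoded
```

The decoded results would then be handed to the scene synthesizer together with the scene description data.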

[0118] Information for displaying video data of a TV
program is formed based on the decoded sound and im-
age data, and scene description data as system data,
and the individual objects and MPEG 2 data are synthe-
sized by a scene synthesizer 5013 into a scene to be
output to the TV, thus outputting scene information.
[0119] A case will be explained below using Fig. 23 
wherein video data of MPEG 4 TV broadcast containing 
an MPEG 2 image is displayed using the layout setting 
method described in the first embodiment. In the second 
embodiment, assume that the spot relay image 102 
shown in Fig. 23 is an MPEG 2 image. That is, Fig. 23 
shows a video display example of MPEG 4 TV broad- 
cast containing an MPEG 2 image. Fig. 28 shows an 
example of an MPEG 4 bitstream at that time. 
[0120] The MPEG 4 bitstream shown in Fig. 28 is mul- 
tiplexed with data (an MPEG 2 datastream) of the spot 
relay image 102 as object 2. The MPEG 2 datastream 
normally consists of three types of data, i.e., audio data, 
video data, and system data (MPEG 2 additional infor- 
mation). In object 2, the MPEG 2 datastream segments 
each having a predetermined size are multiplexed in ac- 
cordance with predetermined timing adjustment that 
pertains to transmission. Since some MPEG 4 encod- 
ing/decoding circuits have downward compatibility to 
MPEG 2, common circuits are used if necessary so as 
to avoid wasteful use of resources that pertain to encod- 
ing/decoding. 
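
As a rough illustration of carrying the MPEG 2 datastream inside object 2 in segments of a predetermined size, the sketch below splits a byte string into fixed-size pieces. The function name `segment_stream` and the sizes used are assumptions, and the timing adjustment that governs real multiplexing is deliberately left out.

```python
def segment_stream(stream: bytes, size: int):
    """Split an MPEG 2 datastream into segments of at most `size` bytes."""
    if size <= 0:
        raise ValueError("segment size must be positive")
    return [stream[i:i + size] for i in range(0, len(stream), size)]


# Each segment would be carried as a payload of object 2 in the bitstream.
segments = segment_stream(b"abcdefgh", 3)
object2_payloads = [(2, s) for s in segments]
```

Concatenating the segments in order restores the original datastream, which is what allows the receiver to pass it unchanged to the MPEG 2 decoder.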

[0121] In this manner, a layout can be set even for 
MPEG 4 TV broadcast containing image and sound data 
encoded by MPEG 2, as has been described in the first 
embodiment. 

[0122] Also, the time indication image 103 shown in 
Fig. 23 may be the one generated by the TV broadcast 
receiving apparatus. In such case, the character gener- 
ation circuit 40 can generate the time indication image 
103 using time data serving as basis of time indication, 
which is sent from the sender side such as the broadcast 
station or clock signals in the TV broadcast receiving
apparatus. When [...]
data, and the time indication image 103 is generated us-
ing this time data. Furthermore, when the additional data
includes a time indication command that instructs time 
indication using internal clocks of the TV broadcast re- 
ceiving apparatus, or when the system controller 38 has 
issued a unique time indication command, the time in- 
dication image 103 is generated based on the internal 
clocks. Note that the character generation circuit 40 and 
the memory 42 that stores character data actually gen- 
erate the time indication image 103, i.e., have a role of 
character generation, the display controller 34 synthe- 
sizes the generated images, and the system controller 
38 controls them to generate and display the time indi- 
cation image 103. 
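
The choice of time source for the time indication image 103 can be sketched as follows. The dictionary keys `"time_data"` and `"use_internal_clock"`, the function name `select_time`, and the fallback to the internal clock when no time data is sent are all assumptions made for illustration; the source does not specify how the additional data is encoded.

```python
from datetime import datetime


def select_time(additional_data, internal_clock, controller_command=False):
    """Pick the time used to generate the time indication image.

    additional_data:    dict with hypothetical keys "time_data" (datetime)
                        and "use_internal_clock" (bool)
    internal_clock:     callable returning the receiver's own time
    controller_command: True when the system controller has issued its own
                        time indication command
    """
    if controller_command or additional_data.get("use_internal_clock"):
        return internal_clock()
    sent_time = additional_data.get("time_data")
    if sent_time is not None:
        return sent_time
    # Assumption: fall back to the internal clock when no time data is sent.
    return internal_clock()
```

The selected time would then be rendered by the character generation circuit 40 and composed by the display controller 34.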

[0123] Note that the same operation can be imple- 
mented using time stamp data contained as one infor- 
mation in a sub code of the MPEG 2 datastream. 
[0124] Furthermore, since a relatively simple image 
such as the weather forecast image 104 shown in Fig. 
23 can be displayed using CG data, the sender side 
need only send a command indicating an object to be 
displayed, and the receiver side makes operations for 
generating character and CG data so as to appropriately
generate and display a weather forecast image or the 
like. In this manner, the load on transmission (commu- 
nication) can be reduced, and the transmission efficien- 
cy can be improved. 

[0125] Of course, according to this embodiment, a 
display image generated by character & CG generation 
can be handled in the same manner as other objects, 
and can be freely laid out. 

[0126] As for layout setups of a display image, when
the user adjusts the size of an object or sets the sound
volume, sound field lateralization, or the like, the shift
amount or change point of the object for which the layout
has been changed is converted into data on the basis of
the object layout information obtained from scene de-
scription information, and the position data or control
data of each unit used at that time is stored as layout
setting data in the same manner as in the first embodi-
ment described with reference to Figs. 20 and 21.
[0127] The format of the layout setting data has al- 
ready been explained using Fig. 22. Fig. 29 shows an 
example of the format of time data and its setup data 
used when display of the time indication image 103 is 
set in more detail. 

[0128] The time data shown in Fig. 29 has ON/OFF 
flags indicating display/non-display of display contents, 
i.e., (dummy), year, month, day, hour, minute, second, 
and display frame (number). In this manner, time infor- 
mation to be displayed on a given image can be set in 
detail. Furthermore, by holding display data upon two- 
dimensionally expressing display position by the x- and 
y-axes, and data indicating size, the time data can be 
used as layout setting data. As additional data, unique 
data to be added as character options such as a font, 
style, color, alignment, and the like may be held in terms 
of display of character information such as time or the 
like. 
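
A time data record with per-field ON/OFF flags plus position and size, in the spirit of Fig. 29, might look like the sketch below. The class name `TimeDisplaySetting`, the field names, the default values, and the text format produced by `render` are assumptions; the actual format of Fig. 29 is not reproduced here, and the character options (font, style, color, alignment) are omitted.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class TimeDisplaySetting:
    """Hypothetical record: one ON/OFF flag per displayed field."""
    year: bool = True
    month: bool = True
    day: bool = True
    hour: bool = True
    minute: bool = True
    second: bool = True
    frame: bool = True
    x: int = 0       # display position on the x-axis
    y: int = 0       # display position on the y-axis
    size: int = 100  # display size (arbitrary unit)

    def render(self, t, frame_number=0):
        """Build the time string from the fields whose flags are ON."""
        date = [f"{v:02d}" for on, v in
                ((self.year, t.year), (self.month, t.month), (self.day, t.day)) if on]
        clock = [f"{v:02d}" for on, v in
                 ((self.hour, t.hour), (self.minute, t.minute), (self.second, t.second)) if on]
        parts = []
        if date:
            parts.append("/".join(date))
        if clock:
            parts.append(":".join(clock))
        if self.frame:
            parts.append(f"#{frame_number}")
        return " ".join(parts)
```

Turning a flag off simply drops that field from the rendered string, so a user who only wants hours and minutes disables the other flags.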

[0129] Since the second embodiment is applied to 
MPEG 4 TV broadcast multiplexed with an MPEG 2 im- 
age, when the system of the second embodiment is 
combined with an image relay system used to relay 
MPEG 2 contents, e.g., a live image from a given spot, 
the output from an MPEG 2 device can be used in the 
MPEG 4 broadcast system without requiring complicat- 
ed data conversion, and such system is easy to use due 
to affinity between MPEG 2 and MPEG 4. The present 
invention can be applied not only to a relay image but 
also to a multiplexed image output example such as ref- 
erence video display using a DVD as a typical MPEG 2 
video device or an example using another MPEG 2 de- 
vice. 

[0130] Since there are a large number of encoding/ 
decoding circuits that can be commonly used for MPEG 
2 and MPEG 4, no complicated circuit arrangement is 
required in addition to high system efficiency. Of course, 
the system efficiency can be improved even in case of 
a software decoder. In the second embodiment, an 
MPEG 2 datastream is multiplexed as one MPEG 4 ob-
ject. Also, when layout information is multiplexed as ad-
ditional data in MPEG 2 system data, the same effect
can be provided. 

[0131] Furthermore, according to the second embod-
iment, since TV information encoded by MPEG 2 can
also be used in an MPEG 4 TV system in addition to the 
effect of the first embodiment, existing contents can be 
directly used, and MPEG 2 data need not be converted 
into MPEG 4 data, thus providing a very effective system 
which is easy to use. 

[0132] In this manner, digital TV broadcast can be 
easily combined with a personal computer (PC), and 
layout setups which are currently done on the desktop 
of a PC can also be used to customize TV video data. 
Hence, compatibility between TV broadcast and PC can 
be improved, and the market in the field of digital hybrid 
products can be expected to be broadened. 

Third Embodiment

[0133] A TV broadcast receiving apparatus according
to the third embodiment of the present invention will be 
described in detail below with reference to the accom- 
panying drawings. Fig. 30 is a block diagram showing 
the arrangement of a TV broadcast receiving apparatus 
of the third embodiment. Note that the same reference 
numerals in the third embodiment denote the same 
parts as those in the first embodiment, and a detailed 
description thereof will be omitted. 
[0134] In the third embodiment as well, system data 
(including scene description data and additional data) is 
decoded by the system data decoding circuit 36. A cat-
egory information detector 137 detects category infor-
mation appended to a program from the decoded sys-
tem data. The detected category information is input to
the system controller 38, which generates commands in 
layout setups with reference to this information. Also, of 
the decoded system data, data that pertains to scene 
description is input to the scene description data con- 
version circuit 39. The remaining system data (including 
object information that represents the contents of ob- 
jects by commands) are input as various commands to 
the system controller 38. Note that the additional data 
may contain a document or the like such as a title index 
of a program or the like. 

[0135] Object information is assigned to each object 
like a title by a command set commonly used by respec- 
tive TV stations. Upon reception, the contents of the ob- 
ject can be discriminated and classified by analyzing the 
object information. This embodiment implements a lay- 
out setting function for laying out an object having des- 
ignated object information at a set position using the ob- 
ject information. 

[0136] Using the scene description data obtained by 
the scene description data conversion circuit 39, layout 
and composition of objects in the display controller 34, 
and setups of the sound volume, sound field lateraliza-
tion, and the like in the sound controller 30 are made.
By adjusting the scene description data conversion cir-
cuit 39 and controlling the display controller 34 under
the control of the system controller 38, objects can be 
laid out at positions different from a basic layout, i.e., 
layout control upon setting an arbitrary layout can be 
done. The layout setting method will be explained later. 
[0137] The user can input various commands via the 
instruction input unit 45. Position adjustment in a layout 
setting process can be done based on the user instruc- 
tion input. That is, correction of the layout position and 
input of new setting values are made via the instruction 
input unit 45. The system controller 38 appropriately
controls the operations of the respective units in accord- 
ance with input instruction values to obtain a desired 
output (display, reproduction) pattern. 

[Layout Setups] 

[0138] Layout setups for setting objects at predeter- 
mined positions in units of categories by discriminating 
category information can be implemented by two meth- 
ods. The first method sets a layout using layout setting 
data held as a pre-programmed factory default in the 
memory 41. The second method uses layout setting da-
ta of layouts which are arbitrarily set by the user and held
in the memory 41 in units of categories. 
[0139] Since the layout setting method has already 
been exemplified in the first embodiment, a detailed de- 
scription thereof will be omitted. 
[0140] Object information used to discriminate an ob- 
ject to be processed is necessary as a part of layout set- 
ting data. A display process is controlled by the system 
controller 38, and control data at that time, object infor- 
mation for discriminating the object to be processed, 
and layout setting data are held in the memory 41 as 
user layout setting data corresponding to a given cate- 
gory. 

[0141] TV broadcast data contains category informa- 
tion. Using this category information, layouts set in units 
of programs can be converted into data in correspond- 
ence with category information, and the converted data 
can be stored as layout setting data. As the storage lo- 
cation of layout setting data, the nonvolatile memory 41 
such as an EEPROM or the like is used. Upon detection
of category information stored in the memory 41 from 
TV broadcast data, the system controller 38 controls the 
scene description data conversion circuit 39 and display 
controller 34 on the basis of the layout setting data cor- 
responding to the detected category information to 
make image display and sound reproduction in a layout 
set by the user. 

[0142] Layout setting data will be explained next. As
layout setting data, default setting data which is pre-pro-
grammed and held, and data set by the user are avail-
able. As the user setting data, the object position set
by the user upon setting a layout is converted into data,
in addition to the object layout information, on the basis
of the object layout information obtained from scene
description data, and the converted data is stored as
layout setting data together with control data of respec-
tive units and object information to be processed. As
has already been described previously with reference
to Fig. 12, the scene description data is information for
laying out objects that form each scene in a tree pattern,
and designating the display times and positions of the
individual objects.
[0143] As another format of layout setting data, as
shown in Fig. 22, when ON/OFF data indicating whether
or not the object of interest is displayed, display position
data obtained when the display position is two-dimen-
sionally expressed by the x- and y-axes, and data indi-
cating size are held, they can be used as layout setting
data.
[0144] In the general format of an MPEG 4 bitstream 
shown in Fig. 24, the program contents, photo image 
object, sound object, CG object, and the like (although 
the types of objects vary depending on programs) are 
multiplexed in a database of objects 1 to 4. For example, 
in a live program of a baseball game, these objects cor- 
respond to a background object (sprite), photo image 
objects of players and the like, a synthetic image object 
of score indication, a sound object, and the like. In ad- 
dition, scene description information and additional data 
are multiplexed as system data in the bitstream. The ad- 
ditional data includes category information and object 
information. 

[0145] Figs. 31 and 32 show frame setup examples 
in a live program of a baseball game, and Figs. 33 and 
34 show display examples of the live program of the
baseball game. 

[0146] In the live program of the baseball game, as- 
sume that objects for which a layout can be set include 
a score indication object 310 and count indication object
311 shown in Figs. 31 and 33, and a batting average 
indication object 312 shown in Figs. 32 and 34. Since 
these three objects are indispensable in the live pro- 
gram of the baseball game, but their display positions 
vary depending on broadcast stations, these objects are 
suitable upon setting a layout. These objects are syn- 
thetic image objects created by CG data or the like, but 
this embodiment is not limited to specific object types. 
[0147] After the layout setting mode is started, the us-
er can lay out these objects at arbitrary positions on the
TV screen, i.e., desired positions or easy-to-see posi- 
tions by the aforementioned method while watching the 
screen. 

[0148] In this manner, using the layout setting function
of this embodiment, the score indication object 310,
count indication object 311, and batting average indica-
tion object 312 can be displayed at default positions or
positions set by the user in units of timings (scenes) at 
which those objects are displayed, as shown in one 
scene of the live program of the baseball game shown 
in Figs. 31 to 34. This layout display is set independently 
of broadcast stations. 
[0149] Once the layout setting data is held, the layout
setting function operates upon detection of identical cat-
egory information, and an object to be processed is dis-
criminated from object information. If the object to be
processed is detected, it is automatically displayed at a
position based on the held layout setting data at that 
display timing (scene). When the data configuration of 
object information varies in units of broadcast stations, 
the object information may be re-set. 

[Operation Sequence] 

[0150] Figs. 35 and 36 are flow charts for explaining 
the operation sequence of the TV broadcast receiving 
apparatus of this embodiment. Fig. 35 shows the flow 
upon setting a layout by the user, and Fig. 36 shows the 
flow upon displaying TV video data. 
[0151] In the layout setting mode shown in Fig. 35, an 
object for which a layout is set is selected from objects 
which form image data in TV information (step S21). The
user lays out the selected (designated) object at an ar- 
bitrary position (step S22). Upon completion of layout of 
the selected object, it is checked if layout setups are to 
end (step S23). If a layout is to be set for another object,
the flow returns to step S21 to repeat selection and lay- 
out of an object. Upon completion of layout setups, the 
positions of the objects for which the layout has been 
set are converted into data. Then, category information, 
object information, position data, and control data for the 
respective units of those objects are combined, and are 
stored as layout setting data in the memory 41 (step 
S24). 
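
The layout setting mode of Fig. 35 can be outlined as below. The function name `run_layout_setting_mode`, the shape of the stored record, and the use of a dictionary in place of the memory 41 are assumptions; control data for the respective units is omitted for brevity.

```python
def run_layout_setting_mode(category, selections, memory):
    """Sketch of steps S21 to S24 of Fig. 35.

    category:   category information of the program being watched
    selections: iterable of (object_info, (x, y)) pairs, one per object the
                user selects (S21) and lays out (S22); when the iterable is
                exhausted, the user has ended the setups (S23)
    memory:     dict standing in for the memory 41, keyed by category
    """
    positions = {}
    for object_info, (x, y) in selections:
        # Convert the position chosen by the user into data.
        positions[object_info] = {"x": x, "y": y}
    # S24: combine category information, object information and position data.
    memory[category] = {"category": category, "objects": positions}
    return memory[category]
```

Keying the record by category rather than by program ID is what lets a layout set during one baseball broadcast apply to later programs of the same category.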

[0152] In the display mode shown in Fig. 36, TV infor-
mation is received (step S31), and category information
of a program is detected from system data appended to 
the TV information (step S32). The category information 
is sent from each broadcast station by appending infor- 
mation corresponding to the category (genre) of a pro- 
gram to system data using a command set or the like 
common to the respective broadcast stations, and is 
used to roughly classify the contents of programs. If pro- 
gram category information varies in units of broadcast 
stations, a re-setting means may be inserted to attain 
consistency among the broadcast stations. 
[0153] It is then checked if layout setting data corre- 
sponding to the detected category information has al- 
ready been saved (step S33). If no layout setting data 
is saved in correspondence with the category informa- 
tion, video data of TV broadcast is displayed in a basic 
layout sent from the broadcast station (step S34). 
[0154] On the other hand, if layout setting data is 
saved in correspondence with the category information, 
the layout setting data corresponding to the detected
category information is read out from the memory 41 
(step S35), and the system controller stands by to start 
control for changing the layout of the object to be proc- 
essed when object information recorded in that layout 
setting data appears. Hence, in step S36, objects other
than those for which the layout is to be changed are dis-
played in the basic layout, and a given object for which
the layout is to be changed is displayed in the set layout 
at a display timing (scene) of that object. 
[0155] The display state in step S34 or S36 is main- 
tained until the program comes to an end or the user 
selects another channel to start reception of a new pro- 
gram. When reception of a new program is started, the 
current layout is reset, and the flow repeats itself from
the initial state of TV broadcast reception in step S31. 
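
The display mode of Fig. 36 can be sketched as follows. The function name `layout_scene`, the object information strings, and the simple mapping from category to stored positions are assumptions made for this illustration only.

```python
def layout_scene(category, scene_objects, memory):
    """Sketch of steps S33 to S36 of Fig. 36 for one scene.

    category:      category information detected from the system data (S32)
    scene_objects: dict mapping object information to the basic (x, y)
                   position given by the scene description
    memory:        dict standing in for the memory 41:
                   category -> {object information -> (x, y) set by the user}
    Returns the position at which each object is displayed.
    """
    setting = memory.get(category)                 # S33
    if setting is None:
        return dict(scene_objects)                 # S34: basic layout
    placed = {}
    for info, basic_position in scene_objects.items():
        # S36: only objects recorded in the layout setting data are moved.
        placed[info] = setting.get(info, basic_position)
    return placed
```

Objects that do not appear in the stored data keep the position sent by the broadcast station, as described for step S36.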
[0156] The third embodiment has exemplified the "live
program of the baseball game" as a category of a pro- 
gram. However, the present invention is not limited to 
such specific category and can be similarly applied to a 
"live program of a soccer game" or categories of pro- 
grams other than sports. 

[0157] As described above, according to the third em-
bodiment, the viewer of digital TV broadcast can arbi-
trarily set the layout of objects in correspondence with
category information of a program. Hence, video display
can be made in correspondence with the category of a
program and the user's preference, the quality of the audiovisual
user interface can be improved, and more flexible TV 
program display can be presented to the user. 
[0158] When a layout is set for each category infor- 
mation of a program with reference to object information 
that indicates the contents of an object, the layout can 
be set for only a designated object by making classifi- 
cation and layout control of objects. 
[0159] Programs of an identical category can be pre- 
vented from being displayed in different layouts depend- 
ing on broadcast stations, and common objects can be 
displayed in a layout standardized in units of program 
categories independently of broadcast stations. 

Fourth Embodiment 

[0160] A TV broadcast receiving apparatus according
to the fourth embodiment of the present invention will 
be explained below. Note that the same reference nu- 
merals in the fourth embodiment denote the same parts 
as those in the first to third embodiments, and a detailed 
description thereof will be omitted.
[0161] The fourth embodiment will explain layout set- 
ups of objects in TV broadcast that uses an image en- 
coded by a coding scheme other than MPEG 4, e.g., an 
MPEG 2 image, as one MPEG 4 object instead, as in 
the second embodiment. 

[0162] A case will be exemplified below with reference
to Fig. 33 wherein video display of MPEG 4 TV broad- 
cast including an MPEG 2 image is made using the lay- 
out setting method described in the third embodiment. 
In the fourth embodiment, assume that a relay image 
401 as an image of the entire baseball live program, 
which includes a background and players, as shown in 
Fig. 33, is an MPEG 2 image. The score indication object 
310, count indication object 311, and objects other than
those described above according to the progress of the
game as other objects are MPEG 4 data. That is, Fig.
33 shows a video display example of MPEG 4 TV broad- 
cast including an MPEG 2 image. Fig. 28 shows an ex- 
ample of an MPEG 4 bitstream at that time. 
[0163] The MPEG 4 bitstream shown in Fig. 28 is mul- 
tiplexed as object 2 with data of the baseball live image 
401 as an MPEG 2 datastream. The MPEG 2 datast-
ream normally consists of three types of data, i.e., audio
data, video data, and system data (MPEG 2 additional 
information). In object 2, the MPEG 2 datastream seg- 
ments each having a predetermined size are multi- 
plexed in accordance with predetermined timing adjust- 
ment that pertains to transmission. Since some MPEG 
4 encoding/decoding circuits have downward compati- 
bility to MPEG 2, common circuits are used if necessary 
so as to avoid wasteful use of resources that pertain to 
encoding/decoding. 

[0164] In this manner, a layout can be set even for 
MPEG 4 TV broadcast containing image data and/or 
sound data encoded by MPEG 2, as has been described 
in the third embodiment. 

[0165] As for layout setting data of a display image, 
as in the third embodiment, position data of an object 
for which the layout has been changed by the user is 
calculated on the basis of object layout information ob- 
tained from scene description information, and is stored 
as layout setting data in correspondence with category 
information of a program, object information to be proc- 
essed, and control data for the respective units. Also, 
operations that pertain to display are the same as the 
third embodiment. 

[0166] According to the fourth embodiment, in addi- 
tion to the effects of the third embodiment, since TV in- 
formation encoded by MPEG 2 can be used in the 
MPEG 4 TV system, existing contents can be directly
used, and MPEG 2 data need not be converted into
MPEG 4 data, thus providing a very effective system
which is easy to use. 

Fifth Embodiment 

[Arrangement] 

[0167] A TV broadcast receiving apparatus according
to the fifth embodiment of the present invention will be 
described in detail below with reference to the accom- 
panying drawings. Fig. 37 is a block diagram showing 
the arrangement of a TV broadcast receiving apparatus 
of the fifth embodiment. Note that the same reference 
numerals in the fifth embodiment denote the same parts 
as those in the first embodiment, and a detailed descrip- 
tion thereof will be omitted. 

[0168] In the fifth embodiment as well, system data 
(including scene description data and additional data) is 
decoded by the system data decoding circuit 36. A time 
information detector 237 detects time information (clock 
data) included in additional information in the system da-
ta from the decoded system data. The detected time in-
formation is input to the system controller 38, which gen-
erates commands in layout setups with reference to this
information. Also, of the decoded system data, data that
pertains to scene description is input to the scene de-
scription data conversion circuit 39. The remaining sys-
tem data (including object information that represents
the contents of objects by commands) are input as var- 
ious commands to the system controller 38. Note that 
the additional data may contain a document or the like 
such as a title index of a program or the like. 
[0169] Object information is assigned to each object 
like a title using a command set (code) common to the 
respective TV stations, a command set (code) set for 
each station, or the like. Upon reception, by analyzing 
the object information, the contents of the correspond- 
ing object can be discriminated and classified. This em- 
bodiment implements a layout setting function for laying 
out an object having designated object information at a 
set position using the object information. 
[0170] Using the scene description data obtained by 
the scene description data conversion circuit 39, layout 
and composition of objects in the display controller 34, 
and setups of the sound volume, sound field lateraliza- 
tion, and the like in the sound controller 30 are made. 
By adjusting the scene description data conversion cir-
cuit 39 and controlling the display controller 34 under
the control of the system controller 38, objects can be 
laid out at positions different from a basic layout, i.e., 
layout control upon setting an arbitrary layout can be 
done. The layout setting method will be explained later. 
[0171] When a display image which is not received as 
an object, for example, a time indication frame, title in- 
dex, or the like is generated inside the receiving appa- 
ratus, the character generation circuit 40 is used. Under 
the control of the system controller 38, a time indication 
character is generated using the memory 42 such as a
ROM or the like that stores character data, on the basis 
of time data contained in the additional data, time infor- 
mation acquired from a calendar (timepiece) function 
unit 47 in the receiving apparatus, or the like. The same 
applies to a title index. The generated image is synthe- 
sized in the display controller 34. 

[Layout Setups] 

[0172] Layout setups in this embodiment are classi- 
fied based on the time base as a combination of units 
such as a time band, days of the week, or the like. Upon 
making actual display in the set layout, if layout setting 
data classified in a time band including the current time 
is found, predetermined setting operation is executed in 
correspondence with the found data. There are two 
sources of time information used to discriminate the cur- 
rent time, which serves as a key upon classifying layout 
setups. One source is the calendar (timepiece) function 
unit 47 in the receiving apparatus, and the other source 
is time information contained in the system data. This 
embodiment can be implemented using either one of 
these sources.

[0173] A layout setup that displays a designated ob- 
ject contained in an image in a predetermined layout in 
correspondence with a predetermined time band or day 
of the week can be executed by the following method. 
That is, layout setting data arbitrarily set by the user are 
held in the memory 41 while being classified based on 
predetermined time bands or days of the week, and are 
used. 

[0174] Since the layout setting method has already 
been exemplified in the first embodiment, a detailed de- 
scription thereof will be omitted. Object information used 
to discriminate an object to be processed is necessary 
as a part of layout setting data. The display process is 
controlled by the system controller 38, and control data 
at that time, object information for discriminating the ob- 
ject to be processed, layout setting data, and a time unit 
command of the time band or day of the week at which 
the set layout display is executed are input and held in 
the memory 41 as user layout setting data in corre- 
spondence with each other. 

[0175] A process for a sound object will be explained 
below. Fig. 38 is a diagram for explaining the output con- 
trol of a sound object in correspondence with layout set- 
ting data. The right and left levels of an input stereo 
sound object 91 are respectively adjusted by amplifiers 
93 and 92 on the basis of gains 95 and 96 controlled by 
a system controller 94. Audio (R) and (L) outputs 98 and 
97 are obtained from the outputs of these amplifiers 93 
and 92. When the system controller 94 adjusts the gains 
95 and 96 in accordance with the layout setting data, 
the balance between the right and left audio output lev- 
els and the sound volume can be adjusted, and sound 
field lateralization between the right and left channels 
can be controlled. That is, by adjusting these gain values 
upon layout setups, a change in layout of the sound ob- 
ject is implemented. In this manner, the sound volume 
adjustment and sound field lateralization setups can be 
achieved. 
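
The gain control described above can be illustrated with a short sketch. It assumes a linear pan law and normalized `volume` and `balance` parameters; neither is specified in this description, and the function names are hypothetical.

```python
from typing import List, Tuple


def stereo_gains(volume: float, balance: float) -> Tuple[float, float]:
    """Return (left_gain, right_gain).

    volume is in [0, 1]; balance is in [-1, 1], where -1 is full left,
    0 is centered, and +1 is full right. The linear pan law is an
    assumption made for illustration only.
    """
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be in [0, 1]")
    if not -1.0 <= balance <= 1.0:
        raise ValueError("balance must be in [-1, 1]")
    left = volume * min(1.0, 1.0 - balance)
    right = volume * min(1.0, 1.0 + balance)
    return left, right


def apply_gains(left: List[float], right: List[float],
                volume: float, balance: float) -> Tuple[List[float], List[float]]:
    """Scale the left and right channels of a stereo sound object."""
    left_gain, right_gain = stereo_gains(volume, balance)
    return [s * left_gain for s in left], [s * right_gain for s in right]
```

In this sketch the two returned gains play the role of the gains 95 and 96, and storing `volume` and `balance` in the layout setting data would be one way to record the layout of a sound object.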

[0176] A sound image and sound field lateralization 
will be explained in more detail with reference to Fig.
39. Sound field lateralization is to form a sound image 
in a sound field space by adjusting the balance (ratio) 
between sound volumes output from the right and left 
loudspeakers (SP-R and SP-L) shown in Fig. 39 and the 
overall sound volume. The sound field space is located 
in a space that connects the viewing/listening position 
and the right and left loudspeakers, and the sound im- 
age moves on two axes, i.e., the right-and-left and back- 
and-forth axes and can be set at an appropriate position 
in the sound field space. By exploiting this concept, the 
right and left audio output levels (sound volume bal- 
ance) and sound volume are adjusted based on layout 
setting data to adjust the outputs from the right and left 
loudspeakers, thus setting sound field lateralization up-
on change in layout. By adjusting phase and reverber-
ation components using a surround speaker system or
the like, sound field lateralization can be freely three-
dimensionally set through 360°.
[0177] As described above, the user can set a layout.
The set layout setting data can be stored while being
classified in units of predetermined periods (time bands,
days of the week, or the like). As the storage location of
layout setting data, the nonvolatile memory 41 such as 
an EEPROM or the like is used. Upon detection of a time 
corresponding to the time band or day of the week set 
by the user to change the layout or to the default time 
band or day of the week from time information, the sys- 
tem controller 38 reads out layout setting data stored in 
the memory 41 and corresponding to the time band or 
day of the week. The system controller 38 controls the 
scene description data conversion circuit 39 and display 
controller 34 on the basis of the readout layout setting 
data to make image display and sound reproduction in 
a layout set by the user. 

[0178] Fig. 40 shows the format of a general MPEG 
4 bitstream. The program contents, photo image object, 
sound object, CG object, and the like (although the types 
of objects vary depending on programs) are multiplexed 
in a database of objects 1 to 5. For example, in a news 
program, these objects correspond to a background ob- 
ject (sprite), photo image objects of a newscaster and 
the like, synthetic image objects such as a weather fore- 
cast, time indication, and the like, a sound object, and 
the like. In addition, scene description information and 
additional data are multiplexed as system data in the 
bitstream. The additional data includes time information, 
object information, and other information. Object infor- 
mation includes a genre code indicating a genre to 
which each of objects corresponding to objects 1 to 5 
belongs, an object code indicating the details of the ob- 
ject, and a broadcast station code required when the ob- 
ject is unique to a given broadcast station. 
[0179] Figs. 41 and 42 show frame setup examples 
by the user. After the layout setting mode is started, the 
user executes layout setups by the aforementioned 
method while watching the screen. 
[0180] A basic image 411 shown in Figs. 41 and 42 is 
obtained by normally displaying an image sent from the 
broadcast station. According to this embodiment, since 
a layout can be arbitrarily set, the user can set a layout 
in advance so that a time indication object 412 in the 
basic image 411 is displayed in an enlarged scale in the 
time band of weekday mornings (e.g., 7 to 8 am), as
shown in Fig. 41. Note that this time band can be arbi-
trarily set, as described above.
[0181] Also, the user can set a layout so that a time 
indication object 413 is cleared from the basic image
411, and a weather forecast object 414 is displayed in 
an enlarged scale at a changed position in the time band 
of holiday mornings, as shown in Fig. 42. 
[0182] In this manner, the day of the week and time 
band can be appropriately combined with various ob- 
jects, and a frame whose layout has been changed can 
be displayed in units of time bands to be set. Once the 
layout is set, when the current time is included in the set 
time band, the held layout setting data is read out to ac-
tivate the layout change function. An object to be proc-
essed is discriminated based on its object information,
and the layout is changed to automatically display that 
object at a predetermined position. 
[0183] Note that the aforementioned layout setting 
data are not limited to those set by the user but may be 
default ones, which were set upon delivery of the receiv- 
ing apparatus from a factory so as to function in a pre- 
determined time band. 

[0184] Object information and layout setting data will 
be described in detail below with reference to Figs. 43 
and 44. Fig. 43 shows the detailed code configuration 
of object information in units of broadcast stations. Fig. 
44 shows the structure of layout setting data. 
[0185] The detailed configuration of object informa- 
tion that has been explained using Fig. 40 is classified, 
as shown in, e.g., Fig. 43. As shown in Fig. 43, genre 
codes are classified into, e.g., "news", "professional
baseball", "wideshow", and the like. When the genre 
code is, e.g., "news", object codes are classified into 
"time indication object", "weather forecast object", 
"newscaster image object", and the like. When the gen- 
re code is "professional baseball" or "wideshow", object 
codes shown in Fig. 43 are stored. Such detailed con- 
figurations of object information are present in units of 
broadcast stations. Code lists for various objects that 
represent the configuration of object information are 
prepared in advance using codes in units of broadcast 
stations or those common to the respective stations. In 
addition, the broadcast stations and receiving appara- 
tuses on the viewer side are set to be able to understand 
identical codes. 
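
As a rough illustration of how a receiver might discriminate objects from such code lists, the sketch below looks up a genre code and an object code in shared tables. The numeric code values, table names, and the `ObjectInfo` type are hypothetical; the actual code assignments of Fig. 43 are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical code tables shared by broadcast stations and receivers.
GENRE_CODES = {0x01: "news", 0x02: "professional baseball", 0x03: "wideshow"}
OBJECT_CODES = {
    0x01: {
        0x01: "time indication object",
        0x02: "weather forecast object",
        0x03: "newscaster image object",
    },
}


@dataclass(frozen=True)
class ObjectInfo:
    genre_code: int
    object_code: int
    station_code: Optional[int] = None  # only needed for station-specific objects


def classify(info: ObjectInfo) -> Optional[Tuple[str, str]]:
    """Return (genre, object name), or None when a code is unknown."""
    genre = GENRE_CODES.get(info.genre_code)
    if genre is None:
        return None
    name = OBJECT_CODES.get(info.genre_code, {}).get(info.object_code)
    if name is None:
        return None
    return genre, name
```

A station-specific table could be selected with `station_code` before the lookup; that step is omitted to keep the sketch short.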

[0186] Also, the layout setting data may have both 
"default setting modes" and "user setting modes", as 
shown in Fig. 44.

[0187] The default setting modes include a "good 
morning" mode (functions: displaying time indication in 
enlarged scale, increasing sound volume, and the like) 
for mornings, a "good night" mode (functions: setting a rel-
atively low sound volume, and the like) for nights, a "go
out" mode (functions: displaying time indication and
weather forecast in enlarged scale, and the like) for
weekday mornings, a "holiday" mode (functions: clearing
time indication, and the like) for weekend mornings, and
the like in correspondence with the days of week and 
time bands. Object information indicating an object for 
which the layout is to be changed, default position data, 
control data of the respective units, broadcast station 
data, and the like are saved as necessary data in units 
of default setting modes. 

[0188] In each user setting mode, the user sets a lay-
out by the aforementioned setting method in units of ar-
bitrary time bands or days of the week, and saves object
information indicating an object for which the layout is 
to be changed, set position data, control data of the re- 
spective units, broadcast station data, and the like as 
layout setting data. In Fig. 44, user setting modes are
set in time bands, i.e., user setup 1 "19:00 to 21:00 on
Monday", user setup 2 "21:00 to 22:00 on Wednesday",
user setup 3 "12:00 to 13:00 on Monday, Wednesday,
and Friday", and user setup 4 "7:30 to 8:30 every day".
In the user setting modes, arbitrary layouts can be set
for various image objects such as a person, telop, and 
the like, and a sound object. Using broadcast station da- 
ta, the system can be activated using a given broadcast 
station as a designation condition. 
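
The four user setups listed above can be modeled as records keyed by day of the week and time band. The following is a minimal sketch under stated assumptions: the `LayoutSetup` type, its field names, and the half-open interval convention are illustrative choices, and time bands that cross midnight are not handled.

```python
from dataclasses import dataclass, field
from datetime import datetime, time
from typing import FrozenSet, List, Optional


@dataclass
class LayoutSetup:
    name: str
    weekdays: FrozenSet[int]            # 0 = Monday ... 6 = Sunday
    start: time
    end: time                           # exclusive end of the time band
    station: Optional[str] = None       # optional broadcast station condition
    object_layouts: dict = field(default_factory=dict)

    def matches(self, now: datetime, station: Optional[str] = None) -> bool:
        if now.weekday() not in self.weekdays:
            return False
        if not (self.start <= now.time() < self.end):
            return False
        return self.station is None or self.station == station


EVERY_DAY = frozenset(range(7))
USER_SETUPS = [
    LayoutSetup("user setup 1", frozenset({0}), time(19, 0), time(21, 0)),
    LayoutSetup("user setup 2", frozenset({2}), time(21, 0), time(22, 0)),
    LayoutSetup("user setup 3", frozenset({0, 2, 4}), time(12, 0), time(13, 0)),
    LayoutSetup("user setup 4", EVERY_DAY, time(7, 30), time(8, 30)),
]


def find_setup(setups: List[LayoutSetup], now: datetime,
               station: Optional[str] = None) -> Optional[LayoutSetup]:
    """Return the first setup whose time band contains `now`, else None."""
    for setup in setups:
        if setup.matches(now, station):
            return setup
    return None
```

Returning the first match is a simplification; a real receiver would need a rule for overlapping time bands, which this description does not define.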

[Operation Sequence] 

[0189] Figs. 45 and 46 are flow charts for explaining 
the operation sequence of the TV broadcast receiving 
apparatus of this embodiment. Fig. 45 shows the flow
upon setting a layout by the user, and Fig. 46 shows the 
flow upon displaying TV video data. 
[0190] In the layout setting mode shown in Fig. 45, a 
time band in which the layout is to be changed is input 
(step S41). The user sets the time band by inputting one
or a plurality of combinations of setups such as in units 
of days of the week, in units of dates, start to end times, 
and the like using units such as year, month, day of the 
week, date, time, minute, and the like. Furthermore, the 
user can input periods such as every week, every other
week, a certain number of days, and the like. 
[0191] Subsequently, the user selects an object for 
which the layout is to be changed from objects that form 
image data in TV information (step S42). The user lays 
out the selected (designated) object at an arbitrary po-
sition (step S43). At this time, display ON/OFF of the
object is simultaneously set. Upon completion of setting 
of the selected object, it is checked if layout setups are 
to end (step S44). If a layout is to be set for another 
object, the flow returns to step S42 to repeat selection
and layout of an object. Upon completion of layout set- 
ups, the positions of the objects for which the layout has 
been set are converted into data. Then, object informa- 
tion, position data, and control data for the respective 
units of each object are combined, and are stored in the
memory 41 as layout setting data in correspondence 
with the input time band (step S45). Note that broadcast 
station (channel) data may be appended as layout set- 
ting data. 

[0192] In the display mode shown in Fig. 46, TV infor-
mation is received (step S51), and time information in-
dicating the current time is detected (step S52). The time
information is acquired and detected from the calendar 
(timepiece) function unit 47 in the receiving apparatus 
or TV broadcast system data.

[0193] It is then checked based on the detected time 
information if layout setting data corresponding to the 
current time as a command has already been stored in 
the memory 41 (step S53). If no layout setting data cor-
responding to the current time is stored, video data of
TV broadcast is displayed in a basic layout sent from 
the broadcast station (step S54). 
[0194] If layout setting data corresponding to the cur-
rent time is stored, that layout setting data is read out
from the memory 41 (step S55), and the system control-
ler stands by to start control for changing the layout of
the object to be processed when object information re-
corded in that layout setting data appears. That is, in
step S56, objects other than those for which the layout 
is to be changed are displayed in the basic layout, and 
a given object for which the layout is to be changed is 
displayed in the set layout at the display timing (scene) 
of that object. 
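
The branch between steps S54 and S56 can be sketched as a function that merges stored layout setting data over the basic layout, object by object. The dictionary shapes and the `visible` key are assumptions made for illustration, not the stored format of this embodiment.

```python
from typing import Dict, List, Optional


def resolve_layouts(scene_objects: List[str],
                    basic_layout: Dict[str, dict],
                    layout_setting: Optional[Dict[str, dict]]) -> Dict[str, dict]:
    """Return the placement to use for each displayed object.

    scene_objects lists the object-information keys present in the scene.
    layout_setting is None when no data is stored for the current time.
    """
    if layout_setting is None:
        # Step S54: everything is shown in the basic layout.
        return {key: dict(basic_layout[key]) for key in scene_objects}

    resolved = {}
    for key in scene_objects:
        override = layout_setting.get(key)
        if override is None:
            # Step S56: objects without a stored layout keep the basic layout.
            resolved[key] = dict(basic_layout[key])
        elif override.get("visible", True):
            placement = {k: v for k, v in override.items() if k != "visible"}
            resolved[key] = {**basic_layout[key], **placement}
        # Objects whose display is set to OFF are left out.
    return resolved
```

With this shape, the holiday-morning example of Fig. 42 would correspond to an override that hides the time indication object and enlarges and moves the weather forecast object.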

[0195] The display state in step S54 or S56 is main- 
tained until the program comes to an end or the user 
selects another channel to start reception of a new pro- 
gram. When reception of a new program is started, the 
current layout is reset, and the flow repeats itself from 
the initial state of TV broadcast reception in step S51.
[0196] As described above, according to the fifth em-
bodiment, TV frame display in a display layout which
gives priority to arbitrary information can be made in cor- 
respondence with the day of the week or time band. 
Hence, video display matching the user's preferences can be
achieved, the quality of the audiovisual user interface 
can be improved, and more flexible TV program display 
can be presented to the user by easy operations. 

Sixth Embodiment 

[0197] A TV broadcast receiving apparatus according
to the sixth embodiment of the present invention will be 
described below. Note that the same reference numer- 
als in the sixth embodiment denote the same parts as 
those in the first to fifth embodiments, and a detailed 
description thereof will be omitted. 
[0198] The sixth embodiment will explain layout set- 
ups of objects in TV broadcast that uses an image en- 
coded by a coding scheme other than MPEG 4, e.g., an 
MPEG 2 image, as one MPEG 4 object instead, as in 
the second embodiment. 

[0199] A case will be exemplified below using Figs. 
41 and 42 wherein video display of MPEG 4 TV broad- 
cast including an MPEG 2 image is made using the lay- 
out setting method described in the fifth embodiment. In 
the sixth embodiment, assume that a relay image object 
displayed on a region 415 is an MPEG 2 image as an 
example of a photo image object handled in a news pro- 
gram shown in Fig. 41 or 42. Other objects are MPEG 
4 data. That is, Figs. 41 and 42 show video display ex- 
amples of MPEG 4 TV broadcast including an MPEG 2 
image. Fig. 28 shows an example of an MPEG 4 bit- 
stream at that time. 

[0200] The MPEG 4 bitstream shown in Fig. 28 is mul- 
tiplexed as object 2 with data of the relay image 415 as 
an MPEG 2 datastream. The MPEG 2 datastream nor- 
mally consists of three types of data, i.e., audio data, 
video data, and system data (MPEG 2 additional infor- 
mation). In object 2, the MPEG 2 datastream segments 
each having a predetermined size are multiplexed in ac-
cordance with predetermined timing adjustment that
pertains to transmission. Since some MPEG 4 encod-
ing/decoding circuits are backward compatible with
MPEG 2, common circuits are used if necessary so as 
to avoid wasteful use of resources that pertain to encod- 
ing/decoding. 

[0201] In this manner, a layout can be set even for 
MPEG 4 TV broadcast containing image data and/or 
sound data encoded by MPEG 2, as has been described 
in the fifth embodiment. 

[0202] As for layout setting data of a display image, 
as in the fifth embodiment, position data of an object for 
which the layout has been changed by the user is cal- 
culated on the basis of object layout information ob- 
tained from scene description information, and is stored 
as layout setting data in correspondence with the time 
band, object information to be processed, control data 
for the respective units, and broadcast station (channel) 
data if necessary. Also, operations that pertain to display 
are the same as the fifth embodiment. 
[0203] In case of a system using both MPEG 2 and 
MPEG 4, time information may be acquired based on 
the time stamp inserted in MPEG 2 system data. 
[0204] As described above, according to the sixth em- 
bodiment, in addition to the effects of the fifth embodi- 
ment, since TV information encoded by MPEG 2 can be 
used in the MPEG 4 TV system, existing contents can 
be directly used, and MPEG 2 data need not be convert- 
ed into MPEG 4 data, thus providing a very effective sys- 
tem which is easy to use. 

Modifications 

[0205] In the sixth embodiment described above, a 
datastream multiplexed with MPEG 2 data as one 
MPEG 4 object is received. Furthermore, the present 
invention can be applied even when various kinds of in- 
formation that pertain to layout setups are included as 
additional data in MPEG 2 system data, and substan- 
tially the same effects as those obtained by an MPEG 4 
bitstream can be obtained. 

[0206] A method of multiplexing an MPEG 4 datast- 
ream on an MPEG 2 datastream as TV information will 
be explained below. 

[0207] The general MPEG 4 datastream format is as 
shown in Fig. 40 above. Fig. 47 shows the MPEG 2 
transport stream structure, i.e., the transmission format 
of an MPEG 2 datastream. A method of multiplexing an 
MPEG 4 datastream on an MPEG 2 datastream will be 
explained below using Fig. 47. 
[0208] An MPEG 2 transport stream is obtained by 
multiplexing into transport packets each having a fixed 
length. The data structure of each transport packet is 
hierarchically expressed, as shown in Fig. 47, and in- 
cludes items shown in Fig. 47. These items will be ex- 
plained in turn below. 
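
As a rough illustration of the fixed-length packet structure, the sketch below parses the 4-byte header of a transport packet. The 188-byte packet length, the sync value 0x47, and the bit widths come from the MPEG-2 Systems standard rather than from this description; the `TsHeader` type and its field names are hypothetical.

```python
from dataclasses import dataclass

TS_PACKET_SIZE = 188   # fixed packet length defined by MPEG-2 Systems
SYNC_BYTE = 0x47       # value of the 8-bit sync signal


@dataclass(frozen=True)
class TsHeader:
    error_indicator: bool
    unit_start: bool
    priority: bool
    pid: int                        # 13-bit packet identification
    scramble_control: int           # 2 bits
    adaptation_field_control: int   # 2 bits
    continuity_counter: int         # 4-bit cyclic counter


def parse_ts_header(packet: bytes) -> TsHeader:
    """Decode the header fields of one transport packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("transport packet must be 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return TsHeader(
        error_indicator=bool(b1 & 0x80),
        unit_start=bool(b1 & 0x40),
        priority=bool(b1 & 0x20),
        pid=((b1 & 0x1F) << 8) | b2,
        scramble_control=(b3 >> 6) & 0x03,
        adaptation_field_control=(b3 >> 4) & 0x03,
        continuity_counter=b3 & 0x0F,
    )
```

A receiver can compare `continuity_counter` across consecutive packets with the same `pid` to detect packets lost in transmission.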

[0209] That is, each transport packet includes an 8-bit
"sync signal (sync)", an "error indicator" indicating the
presence/absence of any bit error in a packet, "unit start"
indicating that a new unit starts from the payload of this
packet, "priority (packet priority)" indicating the impor-
tance level of this packet, "PID (packet identification)"
indicating an attribute of an individual stream, "scramble
control" indicating the presence/absence and type of
scramble, "adaptation field control" indicating the pres-
ence/absence of an adaptation field and the presence/
absence of a payload in this packet, a "cyclic counter"
as information for detecting whether some packets hav-
ing identical PID are discarded in the middle of trans-
mission, an "adaptation field" that can store additional
information or stuffing byte as an option, and a payload
(image or sound information).
[0210] The adaptation field consists of a field length,
various items pertaining to other individual streams, an
optional field, and stuffing byte (invalid data byte).
[0211] For example, an MPEG 4 datastream as sub
image or sound data of TV information, and an ID for
identifying that datastream are considered as ones of
additional data in the optional field, and are multiplexed
in the optional field. That is, main TV information is an
MPEG 2 datastream (transport stream). As exemplified
in Fig. 47, an MPEG 4 datastream is formed by combin-
ing image objects (objects A and B) such as a photo
image, CG, character, and the like having a small data
size, a sound object (object C), scene description infor-
mation (BIFS), and other necessary data (sub data). By
multiplexing this MPEG 4 datastream as a part of the
optional field in the MPEG 2 system data, transmission
of MPEG 2/MPEG 4 multiplexed datastream can be im-
plemented.

[0212] Note that an arbitrary layout can be set for the
image objects having a small data size like the afore-
mentioned MPEG 4 objects. The method and opera-
tions that pertain to layout setups are the same as those
in the fifth embodiment. Also, the MPEG 2 time
stamp may be used as time information upon layout set-
ups.

[0213] Information for setting a layout for an image
generated by character generation means in the receiv-
ing apparatus can also be multiplexed in MPEG 2 sys-
tem data.

[0214] In this manner, the present invention can be
applied not only to MPEG 4 TV broadcast but also to
MPEG 2 and various other digital TV broadcast sys-
tems. Also, an MPEG 4 bitstream can be used in an
MPEG 2 TV broadcast system. Hence, an existing TV
broadcast system can be utilized.
[0215] As many apparently widely different embodi-
ments of the present invention can be made without de-
parting from the scope thereof, it is to be understood
that the invention is not limited to the specific embodi-
ments thereof except as defined in the appended
claims.

Claims

1. A receiving apparatus capable of reproducing im-
age data and/or sound data, comprising:

reception means for receiving information con- 
sisting of image data, sound data, and addition- 
al system data; 

reproducing means for reproducing received 
image and sound data on the basis of the sys- 
tem data; and 

setting means for setting reproduction patterns 
in units of objects when the received image da- 
ta has a data format segmented in units of ob- 
jects. 

2. The apparatus according to claim 1, further com-
prising a memory for storing the reproduction pat-
terns set in units of objects in correspondence with 
information indicating a broadcast program con- 
tained in the system data. 

3. The apparatus according to claim 2, wherein said
reproducing means reproduces the received image 
and sound data on the basis of the reproduction pat- 
tern read out from said memory when the reproduc- 
tion pattern corresponding to information indicating 
a broadcast program included in the received sys- 
tem data is stored in said memory. 

4. The apparatus according to claim 2, wherein the re-
production pattern includes at least one of a display/
non-display setup of an object, movement of a dis-
play position, and a change in display size.



5. The apparatus according to claim 1, wherein the in-
formation is digital television broadcast, which
broadcasts image and sound data encoded by 
MPEG 4. 



6. A receiving method of reproducing image data and/
or sound data, comprising the steps of: 

receiving information consisting of image data, 
sound data, and additional system data; 
reproducing the received image and sound da- 
ta on the basis of the system data; and 
setting reproduction patterns in units of objects 
when the received image data has a data for- 
mat segmented in units of objects. 

7. The method according to claim 6, further compris-
ing the step of storing the reproduction patterns set
in units of objects in a memory in correspondence 
with information indicating a broadcast program 
contained in the system data. 



8. The method according to claim 7, further compris-
ing the step of controlling reproduction of the re-
ceived image and sound data in the reproduction
step on the basis of the reproduction pattern read 
out from the memory when the reproduction pattern 
corresponding to information indicating a broadcast 
program included in the received system data is 
stored in the memory. 

9. The method according to claim 6, wherein the re- 
production pattern includes at least one of a display/ 
non-display setup of an object, movement of a dis- 
play position, and a change in display size. 

10. The method according to claim 6, wherein the infor- 
mation is digital television broadcast, which broad- 
casts image and sound data encoded by MPEG 4. 

11. A computer program product comprising a compu- 
ter readable medium having a computer program 
code, for a method of receiving information, and re-
producing image data and/or sound data, said prod-
uct comprising:

receiving process procedure code for receiving 
information consisting of image data, sound da- 
ta, and additional system data; 
reproducing process procedure code for repro- 
ducing received image and sound data on the 
basis of the system data; and 
setting process procedure code for setting re- 
production patterns in units of objects when the 
received image data has a data format seg- 
mented in units of objects. 

12. A receiving apparatus for receiving MPEG 4 infor- 
mation including image data and/or sound data en- 
coded by another coding scheme, comprising: 

first decoding means for decoding information 
encoded by MPEG 4; 

second decoding means for decoding the im- 
age data and/or sound data encoded by the 
other coding scheme; and 
synthesizing means for synthesizing a plurality 
of image data and/or sound data decoded by 
said first and second decoding means. 

13. The apparatus according to claim 12, wherein said 
second decoding means decodes image data and/ 
or sound data encoded by MPEG 2. 

14. The apparatus according to claim 12, further com- 
prising reproducing means for reproducing the im- 
age data and/or sound data synthesized by said 
synthesizing means. 

15. The apparatus according to claim 13, further com- 
prising setting means for setting a synthetic pattern 



of the plurality of image data to be synthesized by 
said synthesizing means and a reproduction pattern 
by said reproduction means. 

16. The apparatus according to claim 15, further com-
prising a memory for storing the reproduction pat-
tern set by said setting means in correspondence 
with information indicating a broadcast program in- 
cluded in the received information. 

17. A receiving method for receiving MPEG 4 informa- 
tion including image data and/or sound data encod- 
ed by another coding scheme, comprising the steps 
of: 

decoding information encoded by MPEG 4; 
decoding the image data and/or sound data en- 
coded by the other coding scheme; and 
synthesizing a plurality of decoded image data 
and/or sound data.

18. The method according to claim 17, wherein the oth- 
er coding scheme is MPEG 2. 

19. The method according to claim 17, further compris-
ing the step of reproducing the image data and/or
sound data synthesized in the synthesizing step. 

20. The method according to claim 19, further compris-
ing the step of setting a synthetic pattern of the plu-
rality of image data in the synthesizing step and a
reproduction pattern in the reproducing step. 

21. The method according to claim 20, further compris-
ing the step of storing the reproduction pattern set
in the setting step in a memory in correspondence
with information indicating a broadcast program in- 
cluded in the received information. 

22. A computer program product comprising a compu-
ter readable medium having a computer program
code, for a receiving method for receiving MPEG 4 
information including image data and/or sound data 
encoded by another coding scheme, said product 
comprising:

first decoding process procedure code for de- 
coding information encoded by MPEG 4; 
second decoding process procedure code for 
decoding the image data and/or sound data en-
coded by the other coding scheme; and
synthesizing process procedure code for syn- 
thesizing a plurality of decoded image data 
and/or sound data. 

23. A receiving apparatus comprising: 

reception means for receiving a digital data se-
quence;

decoding means for decoding image data, 
sound data, and system data from the received 
digital data sequence; 

setting means for setting a reproduction pattern
corresponding to category information which is 
included in the system data and indicates con- 
tents of the received digital data sequence; and 
control means for controlling the reproduction 
pattern of the decoded image data and/or
sound data on the basis of the decoded system 
data and set reproduction pattern. 



24. The apparatus according to claim 23, wherein the
digital data sequence is television broadcast, which
broadcasts image data and sound data encoded by
MPEG 4.

25. The apparatus according to claim 23, further com-
prising a memory for storing the reproduction pat-
tern corresponding to the category information in
correspondence with the category information and
object information indicating contents of an object
that forms an image.

26. The apparatus according to claim 25, wherein said
setting means reads out the reproduction pattern
corresponding to the category information, and said
control means controls a layout of an object corre-
sponding to the object information, which is stored
in correspondence with the readout reproduction
pattern.

27. The apparatus according to claim 23, further com-
prising:

setting means for manually setting a layout of
a predetermined object; and
a memory for storing the layout set by said set-
ting means together with the category informa-
tion and object information of the predeter-
mined object as information indicating the re-
production pattern.

28. The apparatus according to claim 27, wherein said
setting means reads out the reproduction pattern
corresponding to the category information, and said
control means controls a layout of an object corre-
sponding to the object information, which is stored
in correspondence with the readout reproduction
pattern.

29. A receiving method comprising the steps of:

receiving a digital data sequence;
decoding image data, sound data, and system
data from the received digital data sequence;
setting a reproduction pattern corresponding to
category information which is included in the
system data and indicates contents of the re-
ceived digital data sequence; and
controlling the reproduction pattern of the de-
coded image data and/or sound data on the ba-
sis of the decoded system data and set repro-
duction pattern.

30. The method according to claim 29, wherein the dig-
ital data sequence is television broadcast, which
broadcasts image data and sound data encoded by
MPEG 4.



31. The method according to claim 29, further compris-
ing the step of storing the reproduction pattern cor-
responding to the category information in a memory
in correspondence with the category information
and object information indicating contents of an ob-
ject that forms an image.

32. The method according to claim 31, wherein the set-
ting step includes the step of reading out the repro-
duction pattern corresponding to the category infor-
mation, and the control step includes the step of
controlling a layout of an object corresponding to
the object information, which is stored in corre-
spondence with the readout reproduction pattern.

33. The method according to claim 29, further compris-
ing:

manually setting a layout of a predetermined
object; and
storing the set layout in a memory together with
the category information and object information
of the predetermined object as information in-
dicating the reproduction pattern.

34. The method according to claim 33, wherein the set-
ting step includes the step of reading out the repro-
duction pattern corresponding to the category infor-
mation, and the control step includes the step of
controlling a layout of an object corresponding to
the object information, which is stored in corre-
spondence with the readout reproduction pattern.

35. A computer program product comprising a compu-
ter readable medium having a computer program
code, for a receiving method, said product compris-
ing:

receiving process procedure code for receiving
a digital data sequence;
decoding process procedure code for decoding
image data, sound data, and system data from
the received digital data sequence;
setting process procedure code for setting a re-
production pattern corresponding to category
information which is included in the system data
and indicates contents of the received digital
data sequence; and
controlling process procedure code for control-
ling the reproduction pattern of the decoded im-
age data and/or sound data on the basis of the
decoded system data and set reproduction pat-
tern.
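
The category-based steps recited in claims 29 to 35 (store a reproduction pattern per category together with object information, read it out when that category is detected, and control the layout of the matching objects) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `ObjectLayout`, `LayoutMemory`, `read_out`, and `control_reproduction`, and the choice of position, scale, and visibility as the stored settings, are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ObjectLayout:
    """Layout of one object (hypothetical fields, for illustration only)."""
    object_info: str       # identifies the object the layout applies to
    visible: bool = True   # display ON/OFF
    x: int = 0             # display position
    y: int = 0
    scale: float = 1.0     # upscaling/downscaling factor


class LayoutMemory:
    """Holds reproduction patterns keyed by category information."""

    def __init__(self) -> None:
        self._patterns: Dict[str, Dict[str, ObjectLayout]] = {}

    def store(self, category: str, layouts: List[ObjectLayout]) -> None:
        # Store each layout together with its category and object information.
        self._patterns[category] = {l.object_info: l for l in layouts}

    def read_out(self, category: str) -> Optional[Dict[str, ObjectLayout]]:
        # Returns None when nothing is stored for this category.
        return self._patterns.get(category)


def control_reproduction(memory: LayoutMemory, category: str,
                         basic_layout: List[ObjectLayout]) -> List[ObjectLayout]:
    """Apply the stored pattern for `category`, else keep the basic layout."""
    pattern = memory.read_out(category)
    shown = []
    for obj in basic_layout:
        chosen = obj if pattern is None else pattern.get(obj.object_info, obj)
        if chosen.visible:   # objects switched OFF are not reproduced
            shown.append(chosen)
    return shown
```

Objects without a stored entry keep their basic layout, which is one possible reading of "controls a layout of an object corresponding to the object information"; other policies are equally conceivable.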

36. A receiving apparatus comprising:

reception means for receiving a digital data se-
quence which is encoded by MPEG 4 and in-
cludes image data and/or sound data encoded
by another coding scheme;
first decoding means for decoding image data,
sound data, and system data from the digital
data sequence encoded by MPEG 4;
second decoding means for decoding the im-
age data and/or sound data encoded by the
other coding scheme;
setting means for setting a reproduction pattern
corresponding to category information which is
included in the system data and indicates con-
tents of the received digital data sequence; and
control means for controlling the reproduction
pattern of the image data and/or sound data de-
coded by said first and second decoding means
on the basis of the decoded system data and
set reproduction pattern.

37. The apparatus according to claim 36, wherein said
second decoding means decodes image data and/
or sound data encoded by MPEG 2.

38. The apparatus according to claim 36, further com-
prising a memory for storing the reproduction pat-
tern corresponding to the category information in
correspondence with the category information and
object information indicating contents of an object
that forms an image.

39. The apparatus according to claim 38, wherein said
setting means reads out the reproduction pattern
corresponding to the category information, and said
control means controls a layout of an object corre-
sponding to the object information, which is stored
in correspondence with the readout reproduction
pattern.

40. The apparatus according to claim 36, further com-
prising:

setting means for manually setting a layout of
a predetermined object; and
a memory for storing the layout set by said set-
ting means together with the category informa-
tion and object information of the predeter-
mined object as information indicating the re-
production pattern.

41. The apparatus according to claim 40, wherein said
setting means reads out the reproduction pattern
corresponding to the category information, and said
control means controls a layout of an object corre-
sponding to the object information, which is stored
in correspondence with the readout reproduction
pattern.

42. A receiving method comprising the steps of:

receiving a digital data sequence which is en-
coded by MPEG 4 and includes image data
and/or sound data encoded by another coding
scheme;
decoding image data, sound data, and system
data from the digital data sequence encoded by
MPEG 4;
decoding the image data and/or sound data en-
coded by the other coding scheme;
setting a reproduction pattern corresponding to
category information which is included in the
system data and indicates contents of the re-
ceived digital data sequence; and
controlling the reproduction pattern of the im-
age data and/or sound data decoded in the first
and second decoding steps on the basis of the
decoded system data and set reproduction pat-
tern.

43. The method according to claim 42, wherein the oth-
er coding scheme is MPEG 2.

44. The method according to claim 42, further compris-
ing the step of storing the reproduction pattern cor-
responding to the category information in a memory
in correspondence with the category information
and object information indicating contents of an ob-
ject that forms an image.

45. The method according to claim 44, wherein the set-
ting step includes the step of reading out the repro-
duction pattern corresponding to the category infor-
mation, and the control step includes the step of
controlling a layout of an object corresponding to
the object information, which is stored in corre-
spondence with the readout reproduction pattern.

46. The method according to claim 42, further compris-
ing:

manually setting a layout of a predetermined
object; and
storing the set layout in a memory together with
the category information and object information
of the predetermined object as information in-
dicating the reproduction pattern.

47. The method according to claim 46, wherein the set-
ting step includes the step of reading out the repro-
duction pattern corresponding to the category infor-
mation, and the control step includes the step of
controlling a layout of an object corresponding to
the object information, which is stored in corre-
spondence with the readout reproduction pattern.

48. A computer program product comprising a compu- 
ter readable medium having a computer program 
code, for a receiving method, said product compris- 
ing: 

receiving process procedure code for receiving 
a digital data sequence which is encoded by 
MPEG 4 and includes image data and/or sound 
data encoded by another coding scheme; 
first decoding process procedure code for de- 
coding image data, sound data, and system da- 
ta from the digital data sequence encoded by 
MPEG 4; 

second decoding process procedure code for 
decoding the image data and/or sound data en- 
coded by the other coding scheme; 
setting process procedure code for setting a re- 
production pattern corresponding to category 
information which is included in the system data 
and indicates contents of the received digital 
data sequence; and 

controlling process procedure code for control-
ling the reproduction pattern of the image data
and/or sound data decoded in the first and sec-
ond decoding steps on the basis of the decoded
system data and set reproduction pattern.

49. A receiving apparatus comprising:

reception means for receiving a digital data se-
quence;
decoding means for decoding image data and
system data from the received digital data se-
quence;
obtaining means for obtaining a current time;
control means for controlling a reproduction
pattern of the decoded image data on the basis
of time information obtained by said obtaining
means.

50. The apparatus according to claim 49, wherein the
digital data sequence is television broadcast, which
broadcasts image data and sound data encoded by
MPEG 4.

51. The apparatus according to claim 49, wherein said
control means controls the reproduction pattern of
the image data in units of objects of the image data.

52. The apparatus according to claim 49, wherein said
control means identifies an object to be controlled
on the basis of the system data.

53. The apparatus according to claim 49, wherein said
obtaining means obtains the current time from the
system data.

54. The apparatus according to claim 49, further com-
prising a memory for holding a plurality of reproduc-
tion patterns, and wherein said control means re-
produces an image based on the reproduction pat-
tern when the reproduction pattern corresponding
to the time information is held in said memory.

55. The apparatus according to claim 49, further com-
prising:

setting means for manually setting a layout of
a predetermined object in correspondence with
the time information; and
a memory for holding the layout set by said set-
ting means together with object information of
the predetermined object as information indi-
cating the reproduction pattern.

56. The apparatus according to claim 55, wherein said
setting means sets at least one of a reproduction
ON/OFF state, reproduction position, and repro-
duction size of the predetermined object.

57. The apparatus according to claim 55, wherein said
control means reads out the reproduction pattern
corresponding to the time information from said
memory, and controls a layout of an object corre-
sponding to the object information, which is stored
in correspondence with the readout reproduction
pattern.

58. The apparatus according to claim 49, wherein said
decoding means further decodes sound data from
the digital data sequence, and said control means
further controls a reproduction pattern of the decod-
ed sound data on the basis of the system data and
the time information.

59. The apparatus according to claim 58, wherein said
control means controls a reproduction level and/or
sound field lateralization of a sound object.

60. A receiving method comprising the steps of:

receiving a digital data sequence;
decoding image data and system data from the
received digital data sequence;
obtaining a current time; and
controlling a reproduction pattern of the decod-
ed image data on the basis of obtained time in-
formation.

61. The method according to claim 60, wherein the dig-
ital data sequence is television broadcast, which
broadcasts image data and sound data encoded by
MPEG 4.

62. The method according to claim 60, wherein the con- 
trol step includes the step of controlling the repro- 
duction pattern of the image data in units of objects 
of the image data. 

63. The method according to claim 60, wherein the con- 
trol step includes the step of identifying an object to 
be controlled on the basis of the system data. 

64. The method according to claim 60, wherein the ob- 
taining step includes the step of obtaining the cur- 
rent time from the system data. 

65. The method according to claim 60, further compris- 
ing the steps of: 

holding a plurality of reproduction patterns in a 
memory; and 

reproducing an image based on the reproduc- 
tion pattern when the reproduction pattern cor- 
responding to the time information is held in the 
memory. 

66. The method according to claim 60, further compris- 
ing: 

manually setting a layout of a predetermined 
object in correspondence with the time informa- 
tion; and 

holding the set layout in a memory together with 
object information of the predetermined object 
as information indicating the reproduction pat- 
tern. 

67. The method according to claim 66, wherein the set-
ting step includes the step of setting at least one of
a reproduction ON/OFF state, reproduction posi-
tion, and reproduction size of the predetermined ob-
ject.

68. The method according to claim 66, wherein the con- 
trol step includes the step of reading out the repro- 
duction pattern corresponding to the time informa- 
tion from the memory, and controlling a layout of an 
object corresponding to the object information, 
which is stored in correspondence with the readout 
reproduction pattern. 

69. The method according to claim 60, wherein the de-
coding step includes the step of further decoding
sound data from the digital data sequence, and the
control step further includes the step of controlling
a reproduction pattern of the decoded sound data
on the basis of the system data and the time infor-
mation.

70. The method according to claim 69, wherein the con-
trol step includes the step of controlling a reproduc-
tion level and/or sound field lateralization of a sound
object.

71. A computer program product comprising a compu-
ter readable medium having a computer program
code, for a receiving method, said product compris-
ing:

receiving process procedure code for receiving
a digital data sequence;
decoding process procedure code for decoding
image data and system data from the received
digital data sequence;
obtaining process procedure code for obtaining
a current time; and
controlling process procedure code for control-
ling a reproduction pattern of the decoded im-
age data on the basis of obtained time informa-
tion.
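
The time-based control recited in claims 60 to 71 (obtain the current time, check whether a reproduction pattern corresponding to that time is held in memory, and otherwise fall back to the basic layout) might be sketched as follows. This is a hedged illustration, not the specification's implementation: `TimedPattern`, `hhmm`, `find_pattern`, and `choose_layout` are hypothetical names, time bands are assumed to be half-open intervals in minutes since midnight, and the first matching band is assumed to win when bands overlap.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class TimedPattern:
    """A reproduction pattern tied to a time band (hypothetical structure)."""
    name: str
    start_min: int                             # minutes since midnight, inclusive
    end_min: int                               # minutes since midnight, exclusive
    weekdays: Optional[FrozenSet[int]] = None  # 0 = Monday; None = every day
    layout: str = "set layout"                 # stands in for stored layout data


def hhmm(hours: int, minutes: int = 0) -> int:
    """Convert a clock time to minutes since midnight (24:00 -> 1440)."""
    return hours * 60 + minutes


def find_pattern(patterns: List[TimedPattern],
                 now: datetime) -> Optional[TimedPattern]:
    """Return the first stored pattern whose time band contains `now`."""
    minute_of_day = now.hour * 60 + now.minute
    for p in patterns:
        if p.weekdays is not None and now.weekday() not in p.weekdays:
            continue
        if p.start_min <= minute_of_day < p.end_min:
            return p
    return None


def choose_layout(patterns: List[TimedPattern], now: datetime,
                  basic_layout: str = "basic layout") -> str:
    """Use the stored layout if one matches the current time, else the basic one."""
    match = find_pattern(patterns, now)
    return basic_layout if match is None else match.layout
```

Expressing the end of a band in minutes rather than as a clock time lets a band such as 22:00 to 24:00 be stored without a special case for midnight.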

72. A receiving apparatus comprising:

reception means for receiving a digital data se-
quence which is encoded by a first scheme and
includes image data encoded by a second
scheme;
first decoding means for decoding image data
and system data from the digital data sequence
encoded by the first scheme;
second decoding means for decoding the im-
age data encoded by the second scheme;
obtaining means for obtaining a current time;
and
control means for controlling a reproduction
pattern of the image data decoded by said first
and second decoding means on the basis of the
decoded system data, and time information ob-
tained by said obtaining means.

73. The apparatus according to claim 72, wherein the
first scheme is MPEG 4, and the second scheme is
MPEG 2.

74. The apparatus according to claim 72, wherein the
first scheme is MPEG 2, and the second scheme is
MPEG 4.

75. The apparatus according to claim 72, wherein said
control means controls the reproduction pattern of
the image data in units of objects of the image data.

76. The apparatus according to claim 72, wherein said
control means identifies an object to be controlled
on the basis of the system data.

77. The apparatus according to claim 72, wherein said
obtaining means obtains the current time from the
system data.

78. The apparatus according to claim 72, further com-
prising a memory for holding a plurality of reproduc-
tion patterns, and wherein said control means re-
produces an image based on the reproduction pat-
tern when the reproduction pattern corresponding
to the time information is held in said memory.

79. The apparatus according to claim 72, further com-
prising:

setting means for manually setting a layout of
a predetermined object in correspondence with
the time information; and
a memory for holding the layout set by said set-
ting means together with object information of
the predetermined object as information indi-
cating the reproduction pattern.

80. The apparatus according to claim 79, wherein said
setting means sets at least one of a reproduction
ON/OFF state, reproduction position, and repro-
duction size of the predetermined object.

81. The apparatus according to claim 79, wherein said
control means reads out the reproduction pattern
corresponding to the time information from said
memory, and controls a layout of an object corre-
sponding to the object information, which is stored
in correspondence with the readout reproduction
pattern.

82. The apparatus according to claim 72, wherein said
first and second decoding means further decode
sound data from the digital data sequence, and said
control means further controls a reproduction pat-
tern of the decoded sound data on the basis of the
system data and the time information.

83. The apparatus according to claim 82, wherein said
control means controls a reproduction level and/or
sound field lateralization of a sound object.

84. A receiving method comprising the steps of:

receiving a digital data sequence which is en-
coded by a first scheme and includes image da-
ta encoded by a second scheme;
decoding image data and system data from the
digital data sequence encoded by the first
scheme;
decoding the image data encoded by the sec-
ond scheme;
obtaining a current time; and
controlling a reproduction pattern of the image
data decoded in the first and second decoding
steps on the basis of the decoded system data
and the obtained time information.

85. The method according to claim 84, wherein the first
scheme is MPEG 4, and the second scheme is
MPEG 2.

86. The method according to claim 84, wherein the first
scheme is MPEG 2, and the second scheme is
MPEG 4.

87. The method according to claim 84, wherein the con-
trol step includes the step of controlling the repro-
duction pattern of the image data in units of objects
of the image data.

88. The method according to claim 84, wherein the con-
trol step includes the step of identifying an object to
be controlled on the basis of the system data.

89. The method according to claim 84, wherein the ob-
taining step includes the step of obtaining the cur-
rent time from the system data.

90. The method according to claim 84, further compris-
ing the steps of:

holding a plurality of reproduction patterns in a
memory; and
reproducing an image based on the reproduc-
tion pattern when the reproduction pattern cor-
responding to the time information is held in the
memory.

91. The method according to claim 84, further compris-
ing:

manually setting a layout of a predetermined
object in correspondence with the time informa-
tion; and
holding the set layout in a memory together with
object information of the predetermined object
as information indicating the reproduction pat-
tern.

92. The method according to claim 91, wherein the set-
ting step includes the step of setting at least one of
a reproduction ON/OFF state, reproduction posi-
tion, and reproduction size of the predetermined ob-
ject.

93. The method according to claim 91, wherein the con-
trol step includes the step of reading out the repro-
duction pattern corresponding to the time informa-
tion from the memory, and controlling a layout of an
object corresponding to the object information,
which is stored in correspondence with the readout
reproduction pattern.

94. The method according to claim 84, wherein the de-
coding step includes the step of further decoding
sound data from the digital data sequence, and the
control step includes the step of controlling a repro-
duction pattern of the decoded sound data on the
basis of the system data and the time information.

95. The method according to claim 94, wherein the con-
trol step includes the step of controlling a reproduc-
tion level and/or sound field lateralization of a sound
object.

96. A computer program product comprising a compu-
ter readable medium having a computer program
code, for a receiving method, said product compris-
ing:

receiving process procedure code for receiving
a digital data sequence which is encoded by a
first scheme and includes image data encoded
by a second scheme;
first decoding process procedure code for de-
coding image data and system data from the
digital data sequence encoded by the first
scheme;
second decoding process procedure code for
decoding the image data encoded by the sec-
ond scheme;
obtaining process procedure code for obtaining
a current time; and
controlling process procedure code for control-
ling a reproduction pattern of the image data
decoded in the first and second decoding steps
on the basis of the decoded system data and
the obtained time information.

97. A computer program for controlling a computer to
carry out the method of any one of claims 6 to 10,
17 to 21, 29 to 34, 42 to 47, 60 to 70 or 84 to 95.

98. A carrier medium carrying the computer program of
claim 97.

[Drawing sheets not legible in this copy. Surviving labels: SCENE SYNTHESIZER; reference numerals 5009, 5010, 5011, 5012.]



FIG. 4

Block labels: INPUT; VOP DEFINITION; VOP0 ENCODING; VOP1 ENCODING; VOP2 ENCODING; MULTIPLEXING; BITSTREAM

FIG. 5

Block labels: BITSTREAM; DEMULTIPLEXING; VOP0 DECODING; VOP1 DECODING; VOP2 DECODING; OUTPUT






FIG. 7A

OBJECT UNIT ENCODING
Block labels: VOP; SHAPE INFORMATION ENCODING; MOTION INFORMATION ENCODING; TEXTURE INFORMATION ENCODING (DCT); BITSTREAM

FIG. 7B

FRAME UNIT ENCODING (VLBV CORE)
Block labels: MOTION INFORMATION ENCODING; TEXTURE INFORMATION ENCODING (DCT); BITSTREAM



FIG. 8

a, b, c, x: QUANTIZATION COEFFICIENT OF DC COMPONENT
A, B, C, X: QUANTIZATION COEFFICIENT OF AC COMPONENT



FIG. 9A

TEMPORAL SCALABILITY
Labels: FRAME NUMBER (0 to 6); BASE LAYER; ENHANCEMENT LAYER; ENHANCEMENT TYPE 1; ENHANCEMENT TYPE 2

FIG. 9B

SPATIAL SCALABILITY
ENHANCEMENT LAYER: P-VOP, B-VOP, B-VOP
BASE LAYER: I-VOP, P-VOP, P-VOP



FIG. 10A

SPRITE

FIG. 10B

PERSPECTIVE TRANSFORMATION
x' = (ax + by + c)/(gx + hy + 1)
y' = (dx + ey + f)/(gx + hy + 1)

AFFINE TRANSFORMATION
x' = ax + by + c
y' = dx + ey + f

EQUIDIRECTIONAL UPSCALING (a)/ROTATION (θ)/MOVEMENT (c, f)
x' = a cos θ x + a sin θ y + c
y' = -a sin θ x + a cos θ y + f

TRANSLATION
x' = x + c
y' = y + f
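
The four sprite transformations listed in FIG. 10B form a hierarchy in which each is a special case of the one before it. A small Python sketch under stated assumptions makes that relationship explicit. The function names are hypothetical, and the sketch assumes the usual reading of the garbled figure text: a constant 1 in the perspective denominator, the coefficient e in the affine y' equation, and θ as the rotation angle.

```python
import math


def perspective(x, y, a, b, c, d, e, f, g, h):
    """x' = (ax+by+c)/(gx+hy+1), y' = (dx+ey+f)/(gx+hy+1)."""
    w = g * x + h * y + 1.0
    return ((a * x + b * y + c) / w, (d * x + e * y + f) / w)


def affine(x, y, a, b, c, d, e, f):
    """Perspective transformation with g = h = 0."""
    return perspective(x, y, a, b, c, d, e, f, 0.0, 0.0)


def scale_rotate_move(x, y, a, theta, c, f):
    """Uniform scaling by a, rotation by theta (radians), movement by (c, f)."""
    ca, sa = a * math.cos(theta), a * math.sin(theta)
    return affine(x, y, ca, sa, c, -sa, ca, f)


def translate(x, y, c, f):
    """Pure movement: x' = x + c, y' = y + f."""
    return affine(x, y, 1.0, 0.0, c, 0.0, 1.0, f)
```

Writing each function in terms of the more general one keeps the coefficient naming (a to h) consistent across all four cases.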






FIG. 15

Labels: TERMINAL DEVICE; ELEMENTARY STREAM; ACCESS UNIT LAYER; PACKETIZATION; TIME STAMP/REFERENCE CLOCK; FlexMux LAYER; TransMux LAYER; ERROR CORRECTION CODE; MULTIPLEXING SUB LAYER; TransMux STREAM; TRANSMISSION PATH



FIG. 16

CLASS
VIDEO SESSION (VS0)
VIDEO OBJECT (VO0)
VIDEO OBJECT LAYER (VOL0)
GROUP OF VIDEO OBJECT PLANE (GOV0)
VIDEO OBJECT PLANE (VOP0)

VS : Video Session
VO : Video Object
VOL : Video Object Layer
GOV : Group Of Video Object Plane
VOP : Video Object Plane



FIG. 17

IRREVERSIBLE DECODING BY NORMAL VLC
Labels: DECODING; (ERROR); UNDECODABLE, DISCARD

REVERSE DECODING
Labels: DECODING; (ERROR); (ERROR); UNDECODABLE, DISCARD






[Block diagram, figure number not legible. Surviving labels: DEMODULATION CIRCUIT; ERROR CORRECTION CIRCUIT; MULTIPLEXED SIGNAL DEMULTIPLEXING CIRCUIT; SOUND DECODING CIRCUIT; IMAGE DECODING CIRCUIT; SYSTEM DATA DECODING CIRCUIT.]



FIG. 20

Labels: 301; 302; (X0, Y0); (ΔX, ΔY); CORRECTION (SHIFT); (X, Y); NEW SETUP (REPLACE); (X', Y'); 306 OBJECT

FIG. 21

Labels: 308 NEW SETUP; 307 SHIFT; 304






FIG. 25

(START)
S1: RECEIVE TV INFORMATION
DETECT PROGRAM ID
SELECT OBJECT
ADJUST LAYOUT
STORE LAYOUT SETTING DATA
MAKE VIDEO DISPLAY BASED ON LAYOUT SETTING DATA
DISPLAY VIDEO IN BASIC LAYOUT
(RETURN)
Other legible labels: YES; S6; S8; S13



FIG. 26

MPEG2 BITSTREAM; 61
Inputs: SOUND; PHOTO IMAGE; SYNTHETIC IMAGE; CHARACTER; SCENE CONFIGURATION INFORMATION
5001 SOUND OBJECT ENCODER
5002 PHOTO IMAGE OBJECT ENCODER
5003 SYNTHETIC IMAGE OBJECT ENCODER
5004 CHARACTER OBJECT ENCODER
5005 SCENE DESCRIPTION INFORMATION ENCODER
5006 DATA MULTIPLEXER
MPEG4 BITSTREAM (TRANSMISSION)



FIG. 27

62 MPEG2 DECODER
MPEG4 BITSTREAM (TRANSMISSION)
5008 SOUND OBJECT DECODER
5009 PHOTO IMAGE OBJECT DECODER
5010 SYNTHETIC IMAGE OBJECT DECODER
5011 CHARACTER OBJECT DECODER
5012 SCENE DESCRIPTION INFORMATION DECODER
SCENE INFORMATION; OUTPUT
5007; 5013 (block labels not legible)




[Block diagram, figure number not legible. Surviving labels: DEMODULATION CIRCUIT; ERROR CORRECTION CIRCUIT; MULTIPLEXED SIGNAL DEMULTIPLEXING CIRCUIT; SOUND DECODING CIRCUIT; IMAGE DECODING CIRCUIT; SYSTEM DATA DECODING CIRCUIT.]



FIG. 35

(START LAYOUT SETTING MODE)
S21: SELECT OBJECT
S22: LAY OUT SELECTED OBJECT
S23: (label not legible)
S24: SAVE LAYOUT SETTING DATA (POSITION DATA, CATEGORY INFORMATION, OBJECT INFORMATION, CONTROL DATA)
(END)



FIG. 36

(START DISPLAY MODE)
S31: RECEIVE TV BROADCAST
S32: DETECT CATEGORY INFORMATION
S33: LAYOUT SETTING DATA CORRESPONDING TO CATEGORY INFORMATION STORED?
YES:
S35: READ OUT LAYOUT SETTING DATA FROM MEMORY
S36: DISPLAY VIDEO IN SET LAYOUT
Otherwise:
S34: DISPLAY VIDEO IN BASIC LAYOUT



[Block diagram, figure number not legible. Surviving labels: DEMODULATION CIRCUIT; ERROR CORRECTION CIRCUIT; MULTIPLEXED SIGNAL DEMULTIPLEXING CIRCUIT.]



FIG. 38

Labels: 95; Gain (two blocks); R; L; LAYOUT SETTING DATA; 96; SYSTEM CONTROLLER; 94



FIG. 39 







[Table of stored layout settings, figure number and column headings not legible. Each row lists: setting name | time band | object data | position data | control data | station data.]

GOOD MORNING | 7:00 - 9:00 | OBJECT INFORMATION | DEFAULT POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
GOOD NIGHT | 22:00 - 24:00 | OBJECT INFORMATION | DEFAULT POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
GO OUT | 6:30 - 8:00, MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY | OBJECT INFORMATION | DEFAULT POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
HOLIDAY | 6:30 - 8:00, SATURDAY, MONDAY | OBJECT INFORMATION | DEFAULT POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
USER 1 | 19:00 - 21:00, MONDAY | OBJECT INFORMATION | SET POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
USER 2 | 21:00 - 22:00, WEDNESDAY | OBJECT INFORMATION | SET POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
USER 3 | 12:00 - 13:00, MONDAY, WEDNESDAY, FRIDAY | OBJECT INFORMATION | SET POSITION DATA | CONTROL DATA | BROADCAST STATION DATA
USER 4 | 7:30 - 8:30 | OBJECT INFORMATION | SET POSITION DATA | CONTROL DATA | BROADCAST STATION DATA



FIG. 45

(START LAYOUT SETTING MODE)
S41: INPUT TIME BAND
SELECT TARGET OBJECT
LAY OUT SELECTED OBJECT
S44: END OF (remainder not legible)? NO / YES
S45: SAVE LAYOUT SETTING DATA (POSITION DATA, OBJECT INFORMATION, CONTROL DATA, TIME BAND)
(END)



FIG. 46

(START DISPLAY MODE)
S51: RECEIVE TV BROADCAST
S52: DETECT TIME INFORMATION
S53: LAYOUT SETTING DATA CORRESPONDING TO CURRENT TIME STORED?
YES:
S55: REPRODUCE LAYOUT SETTING DATA
S56: DISPLAY VIDEO IN SET LAYOUT
Otherwise:
S54: DISPLAY VIDEO IN BASIC LAYOUT






